Data Functions
Using the Studio Data SDK, you can manage your dataset records with several data endpoints.
Upload Dataset
Create a dataset from a file upload.
data_sdk.upload_file(
file='new_file.csv',
name='Dataset name',
media_type=MediaType.CSV,
description='Dataset description')
fsq-data-sdk upload-file \
--name "Dataset name" \
--desc "Dataset description" \
--media-type text/csv \
new_file.csv
curl -X POST 'https://data-api.foursquare.com/v1/datasets/data?name=My+Dataset' \
-H 'Authorization: Bearer <token>' \
-H 'Content-Type: text/csv' \
--data-binary '@/path/to/my_dataset.csv'
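The curl request above can also be assembled with Python's standard library. This is a minimal sketch, not the SDK's implementation: the helper name build_upload_request and the extension-to-media-type table are assumptions made for illustration. Only the endpoint URL, the name query parameter, and the two headers come from the curl example. The request is built but never sent.

```python
import urllib.parse
import urllib.request
from pathlib import Path

# Hypothetical mapping for illustration; only text/csv appears in the examples above.
MEDIA_TYPES = {".csv": "text/csv", ".json": "application/json"}

UPLOAD_URL = "https://data-api.foursquare.com/v1/datasets/data"


def build_upload_request(path, name, token, data):
    """Build (but do not send) the POST request that creates a dataset."""
    suffix = Path(path).suffix.lower()
    if suffix not in MEDIA_TYPES:
        raise ValueError(f"unsupported file extension: {suffix!r}")
    # urlencode escapes the dataset name; spaces become '+'.
    query = urllib.parse.urlencode({"name": name})
    return urllib.request.Request(
        f"{UPLOAD_URL}?{query}",
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": MEDIA_TYPES[suffix],
        },
    )


request = build_upload_request("my_dataset.csv", "My Dataset", "<token>", b"a,b\n1,2\n")
print(request.get_method(), request.full_url)
```

Passing the result to urllib.request.urlopen would send it. Encoding the name with urlencode avoids hand-escaping spaces and special characters in the query string.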
Learn more in the SDK reference
Upload Dataframe
You can use the Python SDK package to upload a Pandas or GeoPandas dataframe.
data_sdk.upload_dataframe(
dataframe,
name='Dataset name',
description='Dataset description')
Learn more in the SDK reference
Create External Dataset
Create an external dataset record that references a dataset by URL. External datasets are loaded from the source every time and are not stored in our system.
If the URL references a cloud storage object, e.g. with the s3:// or gcs:// protocol, and that URL requires authentication, you can include a data connector id referencing a connector with appropriate privileges to read that object.
data_sdk.create_external_dataset(
name = "test-external-dataset",
description = "my external dataset",
source = "https://s3data.example.com/data-source",
connector = "<data-connector-uuid>"
)
fsq-data-sdk create-external-dataset \
--name "test-external-dataset" \
--description "my external dataset" \
--source "https://s3data.example.com/data-source" \
--connector-id "<data-connector-uuid>"
curl -X POST 'https://data-api.foursquare.com/catalog/v1/datasets' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer <TOKEN>' \
--data '{
"name": "My S3 Dataset",
"type": "externally-hosted",
"metadata": {
"source": "s3://my-bucket/path/to/data.parquet"
},
"dataConnectionId": "<SOME_ID>"
}'
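The JSON body in the curl example can be generated programmatically, so that the connector reference is included only when the source requires authentication. This is a minimal sketch: external_dataset_payload is a hypothetical helper, and it assumes that dataConnectionId may simply be omitted for publicly readable URLs. The field names are taken from the curl example.

```python
import json


def external_dataset_payload(name, source, connector_id=None):
    """Build the JSON body for an externally hosted dataset record."""
    payload = {
        "name": name,
        "type": "externally-hosted",
        "metadata": {"source": source},
    }
    # Assumption: the connector reference is only needed for protected sources.
    if connector_id is not None:
        payload["dataConnectionId"] = connector_id
    return payload


public = external_dataset_payload("Public data", "https://s3data.example.com/data-source")
private = external_dataset_payload(
    "My S3 Dataset", "s3://my-bucket/path/to/data.parquet", "<SOME_ID>"
)
print(json.dumps(private, indent=2))
```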
Learn more in the SDK reference
Generate Vector Tiles
Create vector tiles by specifying a source GeoJSON (.geojson), CSV (.csv), or FlatGeobuf (.fgb) file and, optionally, a target dataset.
data_sdk.generate_vectortile(
source="source_dataset_uuid",
target=None
)
fsq-data-sdk generate-vectortile \
--source "source-dataset-uuid" \
--target "optional-target-uuid"
curl --request POST \
--url https://data-api.foursquare.com/v1/datasets/vectortile \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--data '
{
"source": "source-dataset-uuid",
"target": "target-dataset-uuid"
}
'
Learn more in the SDK reference
Get Dataset Metadata
Get a dataset record. This record includes only the dataset's metadata, not the data itself. Pass the UUID or dataset object to receive the dataset's record as a JSON object.
# Get dataset metadata by UUID
data_sdk.get_dataset_by_id("<uuid>")
# Get dataset record by dataset object
## List datasets
datasets = data_sdk.list_datasets()
dataset = datasets[0]
## Get dataset record by dataset object
data_sdk.get_dataset_by_id(dataset)
fsq-data-sdk get-dataset --dataset-id <uuid>
curl -X GET https://data-api.foursquare.com/v1/datasets/<uuid> \
-H 'Authorization: Bearer <token>'
Learn more in the SDK reference
Download Dataset
Download data for a given dataset. Pass the UUID or dataset object to receive the dataset. Provide output_file to write the dataset data to a file, or leave it empty to return a bytes object with the dataset data.
# Download dataset record by dataset object
## List datasets
datasets = data_sdk.list_datasets()
dataset = datasets[0]
# Download to local file
data_sdk.download_dataset(dataset, output_file='output.csv')
# Download to buffer
buffer = data_sdk.download_dataset(dataset)
fsq-data-sdk download-dataset --dataset-id <uuid> --output-file output.csv
curl -X GET https://data-api.foursquare.com/v1/datasets/<uuid>/data \
-H 'Authorization: Bearer <token>'
You can use the Python SDK package to download the data as a Pandas or GeoPandas dataframe.
If the original dataset was a CSV file, a pandas DataFrame will be returned. If it was a GeoJSON file, a geopandas GeoDataFrame will be returned.
# Download dataset record by dataset object
datasets = data_sdk.list_datasets()
dataset = datasets[0]
# Download to a dataframe
df = data_sdk.download_dataframe(dataset)
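When pandas is not available, the bytes object returned by download_dataset can be parsed with the standard library instead. This is a minimal sketch, assuming the dataset is UTF-8 encoded CSV with a header row. rows_from_csv_bytes is a hypothetical helper, and the sample bytes stand in for a real download.

```python
import csv
import io


def rows_from_csv_bytes(buffer, encoding="utf-8"):
    """Decode downloaded CSV bytes into a list of dicts keyed by column name."""
    text = buffer.decode(encoding)
    # newline="" lets the csv module handle line endings itself.
    return list(csv.DictReader(io.StringIO(text, newline="")))


# Stand-in for: buffer = data_sdk.download_dataset(dataset)
buffer = b"city,population\nOslo,709000\nBergen,289000\n"
rows = rows_from_csv_bytes(buffer)
print(rows[0]["city"], len(rows))
```

All values come back as strings, so numeric columns need an explicit conversion.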
Learn more in the SDK reference
Update Dataset
Update an existing dataset with a binary data upload. You can also update the name or description of the dataset. Pass the UUID or dataset object of the dataset to update.
# Select a dataset to update
datasets = data_sdk.list_datasets()
dataset = datasets[0]
# Update the dataset using file upload
data_sdk.update_dataset(
dataset,
name='New name',
description='New description',
file='new_file.csv',
media_type=MediaType.CSV)
fsq-data-sdk update-dataset --dataset-id <id> --media-type <media_type> --file <path> --name <name> --description <description>
This function can make two HTTP calls: one for the dataset data (from a file) and one for the metadata (name and description).
curl -X PUT https://data-api.foursquare.com/v1/datasets/<uuid>/data \
-H 'Authorization: Bearer <token>' \
-H 'Content-Type: text/csv' \
--data-binary '@/path/to/my_dataset.csv'
curl --request PUT \
--url https://data-api.foursquare.com/v1/datasets/<uuid> \
--header 'accept: application/json' \
--header 'content-type: application/json' \
--data '
{
"name": "New name",
"description": "New description"
}
'
You can use the Python SDK package to update the data with a Pandas or GeoPandas dataframe.
# Update the dataset with dataframe
data_sdk.upload_dataframe(
dataframe,
dataset=dataset,
name='Dataset name',
description='Dataset description')
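The split between the data call and the metadata call for a file-based update can be pictured with a small planner. This is a sketch under assumptions: plan_update_calls is a hypothetical helper, and whether the SDK actually skips a call when the corresponding arguments are omitted is not confirmed here. The two URLs are the ones used in the curl examples.

```python
BASE_URL = "https://data-api.foursquare.com/v1/datasets"


def plan_update_calls(uuid, file=None, name=None, description=None):
    """Return the (method, url) pairs an update would need, in order."""
    calls = []
    if file is not None:
        # New data goes to the dataset's /data endpoint.
        calls.append(("PUT", f"{BASE_URL}/{uuid}/data"))
    if name is not None or description is not None:
        # Name and description go to the dataset record itself.
        calls.append(("PUT", f"{BASE_URL}/{uuid}"))
    return calls


both = plan_update_calls("<uuid>", file="new_file.csv", name="New name")
metadata_only = plan_update_calls("<uuid>", description="New description")
print(both)
print(metadata_only)
```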
Learn more in the SDK reference
Delete Dataset
Delete a dataset record. Pass the UUID of the dataset to delete the dataset record. This will delete all data associated with the dataset.
Warning!
This operation cannot be undone. If you delete a dataset that appears in maps, the dataset will be removed from the map. This may cause the map to render incorrectly.
# Delete dataset by dataset object
## Select dataset
datasets = data_sdk.list_datasets()
dataset = datasets[0]
## Delete selected Dataset
data_sdk.delete_dataset(dataset)
# Delete dataset by UUID
data_sdk.delete_dataset("<UUID>")
fsq-data-sdk delete-dataset --dataset-id <uuid>
curl -X DELETE https://data-api.foursquare.com/v1/datasets/<uuid> \
-H 'Authorization: Bearer <token>'
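Because deletion cannot be undone, it can help to wrap delete_dataset in a guard that requires an unambiguous match and an explicit confirmation. This is a minimal sketch: FakeDataset and FakeClient are stand-ins so the example runs without the SDK, the id and name attributes on dataset objects are assumed, and delete_by_name is a hypothetical helper rather than part of the SDK.

```python
from dataclasses import dataclass


@dataclass
class FakeDataset:
    """Stand-in for an SDK dataset object; the id and name fields are assumed."""
    id: str
    name: str


class FakeClient:
    """Minimal stand-in exposing the two SDK methods used below."""

    def __init__(self, datasets):
        self._datasets = list(datasets)

    def list_datasets(self):
        return list(self._datasets)

    def delete_dataset(self, dataset):
        self._datasets = [d for d in self._datasets if d.id != dataset.id]


def delete_by_name(client, name, confirm=False):
    """Delete the single dataset called `name`; refuse unless confirm=True."""
    matches = [d for d in client.list_datasets() if d.name == name]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one dataset named {name!r}, found {len(matches)}")
    if not confirm:
        raise PermissionError("deletion cannot be undone; pass confirm=True")
    client.delete_dataset(matches[0])
    return matches[0].id


client = FakeClient([FakeDataset("uuid-1", "roads"), FakeDataset("uuid-2", "parcels")])
deleted_id = delete_by_name(client, "roads", confirm=True)
remaining = [d.name for d in client.list_datasets()]
print(deleted_id, remaining)
```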
Learn more in the SDK reference
List Datasets
List all dataset records on the authorized account.
datasets = data_sdk.list_datasets()
fsq-data-sdk list-datasets
curl -X GET https://data-api.foursquare.com/v1/datasets \
-H 'Authorization: Bearer <token>'
If you are part of an organization, you can pass the organization parameter to get all dataset records for the organization of the authorized account.
datasets = data_sdk.list_datasets(organization=True)
fsq-data-sdk list-datasets --organization
curl -X GET https://data-api.foursquare.com/v1/datasets/for-organization \
-H 'Authorization: Bearer <token>'
This function returns a list of dataset objects. You can use these dataset objects in other functions.
Learn more in the SDK reference
Query Functions
A subset of the Data SDK's data functions, these endpoints let you query data from databases and data lakes added to Studio via Data Connectors.
List Data Connectors
List all data connectors added to the authorized account.
Parameter | Type | Description |
---|---|---|
organization | boolean | If True, list data connectors for the organization of the authenticated user. |
# List data connectors associated with user account
data_connectors = data_sdk.list_data_connectors()
fsq-data-sdk list-data-connectors
curl -X GET https://data-api.foursquare.com/v1/data-connections
If you are part of an organization, you can pass the organization parameter to get all data connectors for the organization of the authorized account.
# List data connectors associated with organization
data_connectors = data_sdk.list_data_connectors(organization=True)
fsq-data-sdk list-data-connectors --organization
curl -X GET https://data-api.foursquare.com/v1/data-connections/for-organization
This function returns an array of data connector objects that contain data connector metadata. This information, in particular the id field, can be used in execute_query and create_query_dataset.
[DataConnector(id="...", name="connector", description="desc", type=DataConnectorType.POSTGRES, ...)]
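To pass a connector to the query functions, you typically need to pick one entry out of the returned list. This is a minimal sketch of looking one up by name: the Connector dataclass is a stand-in that models only the id and name fields shown in the sample output, and connector_id_by_name is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass
class Connector:
    """Stand-in for the SDK's DataConnector; only id and name are modelled."""
    id: str
    name: str


def connector_id_by_name(connectors, name):
    """Return the id of the first connector with the given name, else None."""
    return next((c.id for c in connectors if c.name == name), None)


# Stand-in for: connectors = data_sdk.list_data_connectors()
connectors = [Connector("uuid-a", "warehouse"), Connector("uuid-b", "connector")]
connector_id = connector_id_by_name(connectors, "connector")
print(connector_id)
```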
list_data_connectors reference
Execute Query
Execute a query against a data connector, returning a dataframe with the results of the query, or None if the output was written to a file.
Parameter | Description |
---|---|
connector | Required. The data connector to use, or its UUID. |
query | Required. The SQL query. |
output_file | The path to write the query output to. |
output_format | The format in which to write the output. |
df = data_sdk.execute_query(
example_data_connector.id,
"select * from table;"
)
fsq-data-sdk execute-query --connector-id <connector uuid> --query <SQL query to use>
curl -X POST https://data-api.foursquare.com/v1/query/gateway/data-queries
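Because execute_query returns either a dataframe or None, calling code should test the result with "is None" rather than truthiness (a pandas DataFrame raises ValueError when used in a boolean context). This is a minimal sketch of that pattern: StubClient is a stand-in that mimics only the documented return behaviour, its list of dicts is a placeholder for a dataframe, and run_query is a hypothetical wrapper.

```python
class StubClient:
    """Stand-in that mimics the documented return behaviour of execute_query."""

    def execute_query(self, connector, query, output_file=None):
        if output_file is not None:
            return None  # results were written to output_file
        return [{"answer": 42}]  # placeholder for a dataframe


def run_query(client, connector, query, output_file=None):
    """Return the query results, or the output path when results go to a file."""
    result = client.execute_query(connector, query, output_file=output_file)
    if result is None:
        return output_file
    return result


client = StubClient()
rows = run_query(client, "<connector uuid>", "select * from table;")
path = run_query(client, "<connector uuid>", "select * from table;", output_file="out.csv")
print(rows, path)
```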
Learn more in the SDK reference
Create Dataset from Query
Create a dataset from a query.
Parameter | Description |
---|---|
connector | Required. The data connector to use, or its UUID. |
query | Required. The SQL query. |
name | Name of the dataset record. |
description | Description for the dataset record. |
data_sdk.create_query_dataset(example_data_connector.id, "select * from table;", "query-dataset", "sample-description")
fsq-data-sdk create-query-dataset --connector-id <connector uuid> --query <SQL query to use> --name <name for the new queried dataset> --description <description of the queried dataset>
curl -X POST https://data-api.foursquare.com/v1/datasets/data-query
Learn more in the SDK reference