pandas
channel_to_dataframe_decimated
channel_to_dataframe_decimated(
channel: Channel,
start: str | datetime | IntegralNanosecondsUTC,
end: str | datetime | IntegralNanosecondsUTC,
*,
buckets: int | None = None,
resolution: int | None = None
) -> DataFrame
Retrieve the channel summary as a pandas.DataFrame, decimated to the given buckets or resolution.
Provide either the number of buckets or the resolution for the output. Resolution is in picoseconds for picosecond-granularity datasets, and in nanoseconds otherwise.
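For illustration, a minimal sketch of fetching a decimated summary, assuming a Channel object has already been retrieved via the SDK and that a one-hour window is of interest:
from datetime import datetime, timedelta, timezone

end = datetime.now(timezone.utc)
start = end - timedelta(hours=1)  # hypothetical one-hour window

# Summarize the channel into 1000 buckets across the window
df = channel_to_dataframe_decimated(channel, start, end, buckets=1000)
print(df.head())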
channel_to_series
channel_to_series(
channel: Channel,
start: datetime | IntegralNanosecondsUTC | None = None,
end: datetime | IntegralNanosecondsUTC | None = None,
relative_to: datetime | IntegralNanosecondsUTC | None = None,
relative_resolution: _LiteralTimeUnit = "nanoseconds",
*,
enable_gzip: bool = True
) -> Series[Any]
Retrieve the channel data as a pandas.Series.
The index of the series is the timestamp of the data. The index name is "timestamp" and the series name is the channel name.
Use relative_to and relative_resolution to return timestamps relative to the given epoch.
Example:
s = channel_to_series(channel)
print(s.name, "mean:", s.mean())
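A sketch of returning relative timestamps, assuming channel was obtained via the SDK and that "seconds" is among the supported _LiteralTimeUnit values:
from datetime import datetime, timezone

# Hypothetical epoch: index values become elapsed time since t0
t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
s = channel_to_series(channel, relative_to=t0, relative_resolution="seconds")
print(s.index[:5])  # elapsed time since t0, in seconds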
datasource_to_dataframe
datasource_to_dataframe(
datasource: DataSource,
channel_exact_match: Sequence[str] = (),
channel_fuzzy_search_text: str = "",
start: str | datetime | IntegralNanosecondsUTC | None = None,
end: str | datetime | IntegralNanosecondsUTC | None = None,
tags: dict[str, str] | None = None,
enable_gzip: bool = True,
*,
channels: Sequence[Channel] | None = None,
num_workers: int = 1,
channel_batch_size: int = 20,
relative_to: datetime | IntegralNanosecondsUTC | None = None,
relative_resolution: _LiteralTimeUnit = "nanoseconds"
) -> DataFrame
Download a datasource to a pandas dataframe, optionally filtered to specific channels.
Parameters:
- datasource: The datasource to download data from
- channel_exact_match: Filter the returned channels to those whose names match all provided strings (case insensitive). For example, a channel named 'engine_turbine_rpm' would match against ['engine', 'turbine', 'rpm'], whereas a channel named 'engine_turbine_flowrate' would not!
- channel_fuzzy_search_text: Filters the returned channels to those whose names fuzzily match the provided string.
- channels: List of channels to fetch data for. If provided, supersedes the search parameters `channel_exact_match` and `channel_fuzzy_search_text`.
- tags: Dictionary of tags to filter channels by
- start: The minimum data updated time to filter channels by
- end: The maximum data start time to filter channels by
- enable_gzip: If true, use gzip when exporting data from Nominal. This will almost always make export faster and use less bandwidth.
- num_workers: Number of parallel processes to use for export requests against the backend. This should roughly correspond to the strength of your network connection, with 4-8 workers being more than sufficient to saturate most connections; a sketch of tuning these knobs follows the example below.
- channel_batch_size: Number of channels to request at a time per worker thread. Reducing this number may allow fetching a larger time duration (i.e., `end` - `start`), depending on how synchronized the timing is amongst the requested channels. Each request returns at most 10_000_000 unique timestamps, so requesting fewer channels allows a larger time window when channels come in at different times (e.g. channel A has timestamps 100, 200, 300, ... and channel B has timestamps 101, 201, 301, ...). This is particularly useful when combined with num_workers when attempting to maximally utilize a machine.
- relative_to: If provided, return timestamps relative to the given epoch time
- relative_resolution: If returning timestamps in relative time, the resolution to use
Returns: A pandas dataframe whose index is the timestamp of the data, and whose column names match those of the selected channels.
Example:
rid = "..." # Taken from the UI or via the SDK
dataset = client.get_dataset(rid)
df = datasource_to_dataframe(dataset)
print(df.head()) # Show first few rows of data
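A second sketch combining channel filtering with parallel export; the rid and the name fragments are placeholders, and the worker and batch settings are illustrative rather than prescriptive:
rid = "..."  # Taken from the UI or via the SDK
dataset = client.get_dataset(rid)

# Fetch only channels whose names contain both "engine" and "rpm"
# (case insensitive), using 4 workers and smaller channel batches
df = datasource_to_dataframe(
    dataset,
    channel_exact_match=["engine", "rpm"],
    num_workers=4,
    channel_batch_size=10,
)
print(df.columns)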
upload_dataframe
upload_dataframe(
client: NominalClient,
df: DataFrame,
name: str,
timestamp_column: str,
timestamp_type: _AnyTimestampType,
description: str | None = None,
channel_name_delimiter: str | None = None,
*,
wait_until_complete: bool = True,
labels: Sequence[str] = (),
properties: Mapping[str, str] | None = None,
tag_columns: Mapping[str, str] | None = None,
tags: Mapping[str, str] | None = None
) -> Dataset
Create a dataset in the Nominal platform from a pandas.DataFrame.
Parameters:
- client (NominalClient) – Client instance to use for creating the dataset
- df (DataFrame) – Dataframe to create a dataset from
- name (str) – Name of the dataset to create, as well as the filename for the uploaded "file".
- timestamp_column (str) – Name of the column containing timestamp information for the dataframe
- timestamp_type (_AnyTimestampType) – Type of the timestamp column, e.g. epoch_seconds, iso8601, etc.
- description (str | None, default: None) – Description of the dataset to create
- channel_name_delimiter (str | None, default: None) – Delimiter to use for folding the channel view to a tree view.
- wait_until_complete (bool, default: True) – If true, wait until all data has been ingested successfully before returning
- labels (Sequence[str], default: ()) – String labels to apply to the created dataset
- properties (Mapping[str, str] | None, default: None) – String key-value pairs to apply to the created dataset
- tag_columns (Mapping[str, str] | None, default: None) – Mapping of column name => tag key to apply to the respective rows of data
- tags (Mapping[str, str] | None, default: None) – Mapping of key-value pairs to apply uniformly as tags to all data within the dataframe.
Returns:
- Dataset – Created dataset
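Example (a sketch assuming an authenticated NominalClient instance named client and epoch-seconds timestamps; the dataframe contents are illustrative):
import time

import pandas as pd

now = int(time.time())
df = pd.DataFrame(
    {
        "timestamp": [now, now + 1, now + 2],  # epoch seconds
        "temperature": [21.5, 21.7, 21.6],
    }
)

# Create a dataset named "example-dataset" and block until ingestion completes
dataset = upload_dataframe(
    client,
    df,
    name="example-dataset",
    timestamp_column="timestamp",
    timestamp_type="epoch_seconds",
)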
upload_dataframe_to_dataset
upload_dataframe_to_dataset(
dataset: Dataset,
df: DataFrame,
timestamp_column: str,
timestamp_type: _AnyTimestampType,
*,
wait_until_complete: bool = True,
file_name: str | None = None,
tag_columns: Mapping[str, str] | None = None,
tags: Mapping[str, str] | None = None
) -> None
Upload a pandas dataframe to an existing dataset as if it were a gzipped CSV file.
Parameters:
- dataset (Dataset) – Dataset to upload the dataframe to
- df (DataFrame) – Dataframe to upload to the dataset
- timestamp_column (str) – Column containing timestamps to use for their respective rows
- timestamp_type (_AnyTimestampType) – Type of timestamp, e.g., epoch_seconds, iso8601, etc.
- wait_until_complete (bool, default: True) – If true, block until data has been ingested
- file_name (str | None, default: None) – Manually override the filename given to the uploaded data. If not provided, defaults to the dataset's name.
- tag_columns (Mapping[str, str] | None, default: None) – Mapping of column names => tag keys to use for their respective rows.
- tags (Mapping[str, str] | None, default: None) – Mapping of key-value pairs to apply uniformly as tags to all data within the dataframe.
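Example (a sketch assuming a Dataset already fetched via the SDK; the "site" column and its tag key are hypothetical):
import time

import pandas as pd

now = int(time.time())
new_rows = pd.DataFrame(
    {
        "timestamp": [now, now + 1],  # epoch seconds
        "temperature": [22.0, 22.1],
        "site": ["lab_a", "lab_a"],  # per-row values to emit as tags
    }
)

# Append the rows, mapping the "site" column to the tag key "site"
upload_dataframe_to_dataset(
    dataset,
    new_rows,
    timestamp_column="timestamp",
    timestamp_type="epoch_seconds",
    tag_columns={"site": "site"},
)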