pandas ¤

channel_to_dataframe_decimated ¤

channel_to_dataframe_decimated(
    channel: Channel,
    start: str | datetime | IntegralNanosecondsUTC,
    end: str | datetime | IntegralNanosecondsUTC,
    *,
    buckets: int | None = None,
    resolution: int | None = None
) -> DataFrame

Retrieve the channel summary as a pandas.DataFrame, decimated to the given buckets or resolution.

Provide either the number of buckets or the resolution for the output. The resolution is in picoseconds for picosecond-granularity datasets, and in nanoseconds otherwise.
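
For example, a minimal sketch of decimating a day of data into a fixed number of buckets (the channel variable and time range are assumptions for illustration):

# Decimate one day of data into 500 buckets; alternatively, pass
# `resolution=...` instead of `buckets=...`.
df = channel_to_dataframe_decimated(
    channel,
    start="2024-01-01T00:00:00Z",
    end="2024-01-02T00:00:00Z",
    buckets=500,
)
print(df.head())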

channel_to_series ¤

channel_to_series(
    channel: Channel,
    start: datetime | IntegralNanosecondsUTC | None = None,
    end: datetime | IntegralNanosecondsUTC | None = None,
    relative_to: datetime | IntegralNanosecondsUTC | None = None,
    relative_resolution: _LiteralTimeUnit = "nanoseconds",
    *,
    enable_gzip: bool = True
) -> Series[Any]

Retrieve the channel data as a pandas.Series.

The index of the series is the timestamp of the data. The index name is "timestamp" and the series name is the channel name.

Use relative_to and relative_resolution to return timestamps relative to the given epoch.

Example:¤

s = channel_to_series(channel)
print(s.name, "mean:", s.mean())
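
If relative timestamps are needed, a minimal sketch (the epoch value below is an assumption for illustration):

from datetime import datetime, timezone

epoch = datetime(2024, 1, 1, tzinfo=timezone.utc)  # assumed epoch for illustration
s = channel_to_series(channel, relative_to=epoch, relative_resolution="nanoseconds")
print(s.index[:5])  # offsets from `epoch` rather than absolute timestamps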

datasource_to_dataframe ¤

datasource_to_dataframe(
    datasource: DataSource,
    channel_exact_match: Sequence[str] = (),
    channel_fuzzy_search_text: str = "",
    start: str | datetime | IntegralNanosecondsUTC | None = None,
    end: str | datetime | IntegralNanosecondsUTC | None = None,
    tags: dict[str, str] | None = None,
    enable_gzip: bool = True,
    *,
    channels: Sequence[Channel] | None = None,
    num_workers: int = 1,
    channel_batch_size: int = 20,
    relative_to: datetime | IntegralNanosecondsUTC | None = None,
    relative_resolution: _LiteralTimeUnit = "nanoseconds"
) -> DataFrame

Download a dataset to a pandas DataFrame, optionally filtering to specific channels of the dataset.


Parameters:

datasource: The datasource to download data from.
channel_exact_match: Filter the returned channels to those whose names match all provided strings
    (case insensitive). For example, a channel named 'engine_turbine_rpm' would match against
    ['engine', 'turbine', 'rpm'], whereas a channel named 'engine_turbine_flowrate' would not.
channel_fuzzy_search_text: Filter the returned channels to those whose names fuzzily match the
    provided string.
channels: List of channels to fetch data for. If provided, supersedes the search parameters
    `channel_exact_match` and `channel_fuzzy_search_text`.
tags: Dictionary of tags to filter channels by.
start: The minimum data updated time to filter channels by.
end: The maximum data start time to filter channels by.
enable_gzip: If true, use gzip when exporting data from Nominal. This will almost always make
    export faster and use less bandwidth.
num_workers: Use this many parallel processes for performing export requests against the backend.
    This should roughly correspond to the strength of your network connection; 4-8 workers are
    more than sufficient to completely saturate most connections.
channel_batch_size: Number of channels to request at a time per worker thread. Reducing this number
    may allow fetching a larger time duration (i.e., `end` - `start`), depending on how synchronized
    the timing is amongst the requested channels. Each request returns at most 10_000_000 unique
    timestamps, so requesting fewer channels allows for a larger time window when channels come in
    at different times (e.g., channel A has timestamps 100, 200, 300, ... and channel B has
    timestamps 101, 201, 301, ...). This is particularly useful when combined with `num_workers`
    to maximally utilize a machine; see the sketch after the example below.
relative_to: If provided, return timestamps relative to the given epoch time.
relative_resolution: If providing timestamps in relative time, the resolution to use.

Returns:

A pandas DataFrame whose index is the timestamp of the data and whose column names match those of
    the selected channels.

Example:¤

rid = "..." # Taken from the UI or via the SDK
dataset = client.get_dataset(rid)
df = datasource_to_dataframe(dataset)
print(df.head())  # Show first few rows of data
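
A further sketch combining channel filtering with parallel export, per the `num_workers` and `channel_batch_size` notes above (the filter strings and tuning values are illustrative assumptions):

df = datasource_to_dataframe(
    dataset,
    channel_exact_match=["engine", "rpm"],  # matches e.g. 'engine_turbine_rpm'
    num_workers=4,           # parallel export requests
    channel_batch_size=10,   # fewer channels per request => larger time window per request
)
print(df.columns)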

upload_dataframe ¤

upload_dataframe(
    client: NominalClient,
    df: DataFrame,
    name: str,
    timestamp_column: str,
    timestamp_type: _AnyTimestampType,
    description: str | None = None,
    channel_name_delimiter: str | None = None,
    *,
    wait_until_complete: bool = True,
    labels: Sequence[str] = (),
    properties: Mapping[str, str] | None = None,
    tag_columns: Mapping[str, str] | None = None,
    tags: Mapping[str, str] | None = None
) -> Dataset

Create a dataset in the Nominal platform from a pandas.DataFrame.

Parameters:

  • client ¤

    (NominalClient) –

    Client instance to use for creating the dataset

  • df ¤

    (DataFrame) –

    Dataframe to create a dataset from

  • name ¤

    (str) –

    Name of the dataset to create, as well as filename for the uploaded "file".

  • timestamp_column ¤

    (str) –

    Name of the column containing timestamp information for the dataframe

  • timestamp_type ¤

    (_AnyTimestampType) –

    Type of the timestamp column, e.g. epoch_seconds, iso8601, etc.

  • description ¤

    (str | None, default: None ) –

    Description of the dataset to create

  • channel_name_delimiter ¤

    (str | None, default: None ) –

Delimiter to use for folding the channel view into a tree view.

  • wait_until_complete ¤

    (bool, default: True ) –

    If true, wait until all data has been ingested successfully before returning

  • labels ¤

    (Sequence[str], default: () ) –

    String labels to apply to the created dataset

  • properties ¤

    (Mapping[str, str] | None, default: None ) –

    String key-value pairs to apply to the created dataset

  • tag_columns ¤

    (Mapping[str, str] | None, default: None ) –

    Mapping of column name => tag key to apply to the respective rows of data

  • tags ¤

    (Mapping[str, str] | None, default: None ) –

    Mapping of key-value pairs to apply uniformly as tags to all data within the dataframe.

Returns:

  • Dataset – The dataset created in the Nominal platform
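
A minimal sketch of creating a dataset from an in-memory dataframe; the column names, tag mapping, and the "iso8601" timestamp type are assumptions for illustration:

import pandas as pd

df = pd.DataFrame(
    {
        "timestamp": ["2024-01-01T00:00:00Z", "2024-01-01T00:00:01Z"],  # assumed data
        "temperature": [21.5, 21.7],
        "site": ["lab_a", "lab_a"],
    }
)
dataset = upload_dataframe(
    client,                        # an existing NominalClient instance
    df,
    name="example-dataset",
    timestamp_column="timestamp",
    timestamp_type="iso8601",      # assumed literal; see _AnyTimestampType
    tag_columns={"site": "site"},  # tag each row with the value of its 'site' column
)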

upload_dataframe_to_dataset ¤

upload_dataframe_to_dataset(
    dataset: Dataset,
    df: DataFrame,
    timestamp_column: str,
    timestamp_type: _AnyTimestampType,
    *,
    wait_until_complete: bool = True,
    file_name: str | None = None,
    tag_columns: Mapping[str, str] | None = None,
    tags: Mapping[str, str] | None = None
) -> None

Upload a pandas dataframe to an existing dataset as if it were a gzipped CSV file.

Parameters:

  • dataset ¤

    (Dataset) –

    Dataset to upload the dataframe to

  • df ¤

    (DataFrame) –

    Dataframe to upload to the dataset

  • timestamp_column ¤

    (str) –

    Column containing timestamps to use for their respective rows

  • timestamp_type ¤

    (_AnyTimestampType) –

    Type of timestamp, e.g., epoch_seconds, iso8601, etc.

  • wait_until_complete ¤

    (bool, default: True ) –

    If true, block until data has been ingested

  • file_name ¤

    (str | None, default: None ) –

    Manually override the filename given to the uploaded data. If not provided, defaults to the dataset's name

  • tag_columns ¤

    (Mapping[str, str] | None, default: None ) –

    Mapping of column names => tag keys to use for their respective rows.

  • tags ¤

    (Mapping[str, str] | None, default: None ) –

    Mapping of key-value pairs to apply uniformly as tags to all data within the dataframe.
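
A minimal sketch of appending a dataframe to an existing dataset, reusing `df` from the sketch above (the rid placeholder and the "epoch_seconds" timestamp type are assumptions for illustration):

dataset = client.get_dataset("...")  # rid taken from the UI or via the SDK
upload_dataframe_to_dataset(
    dataset,
    df,
    timestamp_column="timestamp",
    timestamp_type="epoch_seconds",  # assumed literal; see _AnyTimestampType
    wait_until_complete=True,        # block until ingestion finishes
)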