class BatchDataSource:
Known subclasses: aligned.data_source.batch_data_source.CodableBatchDataSource
, aligned.data_source.model_predictor.PredictModelSource
Undocumented
Class Method | multi | Undocumented
Method | __hash__ | Undocumented
Method | all | Undocumented
Method | all | Undocumented
Method | all | Undocumented
Method | all | Undocumented
Method | depends | Undocumented
Async Method | feature | Set up the code needed to represent the data source as a feature view
Method | features | Undocumented
Method | filter | Undocumented
Async Method | freshness | .table("my_table") .freshness()
Method | job | A key defining which sources can be grouped together in one request.
Method | location | Undocumented
Async Method | schema | Returns the schema for the data source
Method | source | An id that identifies a source from others.
Method | tags | Undocumented
Method | transform | Undocumented
Method | with | Undocumented
Method | with | Undocumented
def multi_source_features_for(cls: type[T], facts: RetrivalJob, requests: list[tuple[T, RetrivalRequest]]) -> RetrivalJob:
aligned.CustomMethodDataSource
, aligned.data_source.batch_data_source.FilteredDataSource
, aligned.data_source.batch_data_source.LoadedAtSource
, aligned.data_source.batch_data_source.StackSource
, aligned.data_source.model_predictor.PredictModelSource
, aligned.sources.azure_blob_storage.AzureBlobCsvDataSource
, aligned.sources.azure_blob_storage.AzureBlobDeltaDataSource
, aligned.sources.azure_blob_storage.AzureBlobParquetDataSource
, aligned.sources.azure_blob_storage.AzureBlobPartitionedParquetDataSource
, aligned.sources.in_mem_source.InMemorySource
, aligned.sources.lancedb.LanceDbTable
, aligned.sources.local.CsvFileSource
, aligned.sources.local.DeltaFileSource
, aligned.sources.local.ParquetFileSource
, aligned.sources.local.PartitionedParquetFileSource
, aligned.sources.psql.PostgreSQLDataSource
, aligned.sources.random_source.RandomDataSource
, aligned.sources.redshift.RedshiftSQLDataSource
Undocumented
aligned.sources.azure_blob_storage.AzureBlobParquetDataSource
, aligned.sources.azure_blob_storage.AzureBlobPartitionedParquetDataSource
, aligned.sources.local.CsvFileSource
, aligned.sources.local.DeltaFileSource
, aligned.sources.local.ParquetFileSource
, aligned.sources.local.PartitionedParquetFileSource
, aligned.sources.psql.PostgreSQLDataSource
, aligned.sources.redshift.RedshiftSQLDataSource
Undocumented
RetrivalRequest, start_date: datetime, end_date: datetime) -> RetrivalJob:
aligned.CustomMethodDataSource
, aligned.data_source.batch_data_source.FilteredDataSource
, aligned.data_source.batch_data_source.JoinAsofDataSource
, aligned.data_source.batch_data_source.JoinDataSource
, aligned.data_source.batch_data_source.LoadedAtSource
, aligned.data_source.batch_data_source.StackSource
, aligned.data_source.model_predictor.PredictModelSource
, aligned.sources.azure_blob_storage.AzureBlobCsvDataSource
, aligned.sources.azure_blob_storage.AzureBlobDeltaDataSource
, aligned.sources.azure_blob_storage.AzureBlobParquetDataSource
, aligned.sources.azure_blob_storage.AzureBlobPartitionedParquetDataSource
, aligned.sources.local.CsvFileSource
, aligned.sources.local.DeltaFileSource
, aligned.sources.local.ParquetFileSource
, aligned.sources.local.PartitionedParquetFileSource
, aligned.sources.psql.PostgreSQLDataSource
, aligned.sources.random_source.RandomDataSource
, aligned.sources.redshift.RedshiftSQLDataSource
, aligned.sources.s3.AwsS3CsvDataSource
Undocumented
aligned.CustomMethodDataSource
, aligned.data_source.batch_data_source.FilteredDataSource
, aligned.data_source.batch_data_source.JoinAsofDataSource
, aligned.data_source.batch_data_source.JoinDataSource
, aligned.data_source.batch_data_source.LoadedAtSource
, aligned.data_source.batch_data_source.StackSource
, aligned.data_source.model_predictor.PredictModelSource
, aligned.sources.azure_blob_storage.AzureBlobCsvDataSource
, aligned.sources.azure_blob_storage.AzureBlobDeltaDataSource
, aligned.sources.azure_blob_storage.AzureBlobParquetDataSource
, aligned.sources.azure_blob_storage.AzureBlobPartitionedParquetDataSource
, aligned.sources.lancedb.LanceDbTable
, aligned.sources.local.CsvFileSource
, aligned.sources.local.DeltaFileSource
, aligned.sources.local.ParquetFileSource
, aligned.sources.local.PartitionedParquetFileSource
, aligned.sources.psql.PostgreSQLDataSource
, aligned.sources.random_source.RandomDataSource
, aligned.sources.redshift.RedshiftSQLDataSource
, aligned.sources.s3.AwsS3CsvDataSource
Undocumented
aligned.CustomMethodDataSource
, aligned.data_source.batch_data_source.FilteredDataSource
, aligned.data_source.batch_data_source.JoinAsofDataSource
, aligned.data_source.batch_data_source.JoinDataSource
, aligned.data_source.batch_data_source.LoadedAtSource
, aligned.data_source.batch_data_source.StackSource
, aligned.data_source.model_predictor.PredictModelSource
, aligned.sources.random_source.RandomDataSource
Undocumented
aligned.sources.local.CsvFileSource
, aligned.sources.local.DeltaFileSource
, aligned.sources.local.ParquetFileSource
, aligned.sources.local.PartitionedParquetFileSource
Set up the code needed to represent the data source as a feature view
```python
await FileSource.parquet("my_path.parquet").feature_view_code(view_name="my_view")

>>> """from aligned import FeatureView, String, Int64, Float

class MyView(FeatureView):

    metadata = FeatureView.metadata_with(
        name="Embarked",
        description="some description",
        batch_source=FileSource.parquet("my_path.parquet"),
        stream_source=None,
    )

    Passenger_id = Int64()
    Survived = Int64()
    Pclass = Int64()
    Name = String()
    Sex = String()
    Age = Float()
    Sibsp = Int64()
    Parch = Int64()
    Ticket = String()
    Fare = Float()
    Cabin = String()
    Embarked = String()"""
```
Returns:
    str: The code needed to set up a basic feature view
aligned.CustomMethodDataSource
, aligned.data_source.batch_data_source.StackSource
, aligned.data_source.model_predictor.PredictModelSource
, aligned.sources.azure_blob_storage.AzureBlobCsvDataSource
, aligned.sources.azure_blob_storage.AzureBlobDeltaDataSource
, aligned.sources.azure_blob_storage.AzureBlobParquetDataSource
, aligned.sources.azure_blob_storage.AzureBlobPartitionedParquetDataSource
Undocumented
aligned.data_source.batch_data_source.FilteredDataSource
, aligned.data_source.batch_data_source.JoinAsofDataSource
, aligned.data_source.batch_data_source.JoinDataSource
, aligned.data_source.batch_data_source.LoadedAtSource
, aligned.sources.azure_blob_storage.AzureBlobDeltaDataSource
, aligned.sources.lancedb.LanceDbTable
, aligned.sources.local.CsvFileSource
, aligned.sources.psql.PostgreSQLDataSource
, aligned.sources.redshift.RedshiftSQLDataSource
my_table_freshness = await (PostgreSQLConfig("DB_URL")
    .table("my_table")
    .freshness()
)
aligned.CustomMethodDataSource
, aligned.data_source.batch_data_source.FilteredDataSource
, aligned.data_source.batch_data_source.JoinAsofDataSource
, aligned.data_source.batch_data_source.JoinDataSource
, aligned.data_source.batch_data_source.LoadedAtSource
, aligned.data_source.batch_data_source.StackSource
, aligned.data_source.model_predictor.PredictModelSource
, aligned.sources.azure_blob_storage.AzureBlobCsvDataSource
, aligned.sources.azure_blob_storage.AzureBlobDeltaDataSource
, aligned.sources.azure_blob_storage.AzureBlobParquetDataSource
, aligned.sources.azure_blob_storage.AzureBlobPartitionedParquetDataSource
, aligned.sources.in_mem_source.InMemorySource
, aligned.sources.lancedb.LanceDbTable
, aligned.sources.local.CsvFileSource
, aligned.sources.local.DeltaFileSource
, aligned.sources.local.ParquetFileSource
, aligned.sources.local.PartitionedParquetFileSource
, aligned.sources.psql.PostgreSQLDataSource
, aligned.sources.random_source.RandomDataSource
, aligned.sources.redshift.RedshiftSQLDataSource
, aligned.sources.s3.AwsS3CsvDataSource
, aligned.sources.s3.AwsS3ParquetDataSource
A key defining which sources can be grouped together in one request.
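As a rough illustration of how such a key could be used, the sketch below buckets sources by a string key so that each bucket could be served by one request. `FakeSource`, its `connection` field, the choice of key, and `group_by_job_key` are hypothetical stand-ins introduced for this example; they are not part of aligned.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeSource:
    """Hypothetical stand-in for a batch data source (not an aligned class)."""

    name: str
    connection: str

    def job_group_key(self) -> str:
        # Assumption for this sketch: sources that share a connection
        # can be fetched together in one request.
        return self.connection


def group_by_job_key(sources: list[FakeSource]) -> dict[str, list[FakeSource]]:
    """Bucket sources by their job group key, preserving input order."""
    groups: dict[str, list[FakeSource]] = defaultdict(list)
    for source in sources:
        groups[source.job_group_key()].append(source)
    return dict(groups)


sources = [
    FakeSource("users", "psql://db_a"),
    FakeSource("orders", "psql://db_a"),
    FakeSource("events", "s3://bucket"),
]
groups = group_by_job_key(sources)
for key, members in groups.items():
    print(key, [s.name for s in members])
```

Each bucket holds the sources that report the same key, so a caller could issue one request per bucket instead of one per source.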
aligned.data_source.batch_data_source.FilteredDataSource
, aligned.data_source.batch_data_source.JoinAsofDataSource
, aligned.data_source.batch_data_source.JoinDataSource
, aligned.data_source.batch_data_source.LoadedAtSource
, aligned.data_source.batch_data_source.StackSource
, aligned.data_source.model_predictor.PredictModelSource
, aligned.sources.azure_blob_storage.AzureBlobCsvDataSource
, aligned.sources.azure_blob_storage.AzureBlobDeltaDataSource
, aligned.sources.azure_blob_storage.AzureBlobParquetDataSource
, aligned.sources.azure_blob_storage.AzureBlobPartitionedParquetDataSource
, aligned.sources.local.CsvFileSource
, aligned.sources.local.DeltaFileSource
, aligned.sources.local.ParquetFileSource
, aligned.sources.local.PartitionedParquetFileSource
, aligned.sources.psql.PostgreSQLDataSource
, aligned.sources.random_source.RandomDataSource
Returns the schema for the data source
```python
source = FileSource.parquet_at('test_data/titanic.parquet')
schema = await source.schema()
>>> {'passenger_id': FeatureType(name='int64'), ...}
```
Returns:
    dict[str, FeatureType]: A dictionary containing the column name and the feature type
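A dictionary keyed by column name lends itself to simple validation, for example checking that the expected columns are present before loading data. The following is a minimal sketch under stated assumptions: `StubFeatureType`, `StubSource`, and `missing_columns` are hypothetical names made up for the example, and the stub only mirrors the `name` field shown in the output above.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class StubFeatureType:
    """Hypothetical stand-in that mirrors only the `name` field."""

    name: str


class StubSource:
    """Hypothetical source whose schema is hard-coded for the example."""

    async def schema(self) -> dict[str, StubFeatureType]:
        return {
            "passenger_id": StubFeatureType(name="int64"),
            "age": StubFeatureType(name="float"),
        }


async def missing_columns(source: StubSource, expected: list[str]) -> list[str]:
    """Return the expected columns that the source's schema does not contain."""
    schema = await source.schema()
    return [column for column in expected if column not in schema]


missing = asyncio.run(
    missing_columns(StubSource(), ["passenger_id", "age", "fare"])
)
print(missing)
```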
Callable[[pl.LazyFrame], Awaitable[pl.LazyFrame]] | Callable[[pl.LazyFrame], pl.LazyFrame]) -> CodableBatchDataSource:
Undocumented
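The parameter type in the signature above accepts either a synchronous or an asynchronous callable. One common way to support both is to call the function and await the result only if it is awaitable. The sketch below shows that pattern with plain lists standing in for `pl.LazyFrame`, so it has no third-party dependencies. `apply_transform`, `double_sync`, and `double_async` are hypothetical helpers, and this is not necessarily how aligned handles the callable internally.

```python
import asyncio
import inspect
from typing import Any, Awaitable, Callable, Union

# A transform may return its result directly or return an awaitable of it.
Transform = Callable[[Any], Union[Any, Awaitable[Any]]]


async def apply_transform(frame: Any, transform: Transform) -> Any:
    """Call `transform` and await the result only when it is awaitable."""
    result = transform(frame)
    if inspect.isawaitable(result):
        result = await result
    return result


def double_sync(rows: list[int]) -> list[int]:
    return [value * 2 for value in rows]


async def double_async(rows: list[int]) -> list[int]:
    await asyncio.sleep(0)  # yield control once, as real async I/O would
    return [value * 2 for value in rows]


sync_result = asyncio.run(apply_transform([1, 2, 3], double_sync))
async_result = asyncio.run(apply_transform([1, 2, 3], double_async))
print(sync_result, async_result)
```

Both calls go through the same code path, which lets a caller pass whichever style of function is more convenient.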
aligned.sources.azure_blob_storage.AzureBlobCsvDataSource
, aligned.sources.azure_blob_storage.AzureBlobParquetDataSource
, aligned.sources.azure_blob_storage.AzureBlobPartitionedParquetDataSource
, aligned.sources.in_mem_source.InMemorySource
, aligned.sources.local.CsvFileSource
, aligned.sources.local.ParquetFileSource
, aligned.sources.local.PartitionedParquetFileSource
Undocumented