class PartitionedParquetFileSource(CodableBatchDataSource, ColumnFeatureMappable, DataFileReference, WritableFeatureSource, Deletable):

A source pointing to a partitioned Parquet file.
| Kind | Name | Summary |
| --- | --- | --- |
| Class Method | `multi` | Undocumented |
| Method | `__hash__` | Undocumented |
| Method | `all` | Undocumented |
| Method | `all` | Undocumented |
| Async Method | `delete` | Undocumented |
| Async Method | `feature` | Setup the code needed to represent the data source as a feature view |
| Async Method | `insert` | Undocumented |
| Method | `job` | A key defining which sources can be grouped together in one request. |
| Async Method | `overwrite` | Undocumented |
| Async Method | `schema` | Returns the schema for the data source |
| Async Method | `to` | Undocumented |
| Async Method | `to` | Undocumented |
| Async Method | `upsert` | Undocumented |
| Method | `with` | Undocumented |
| Async Method | `write` | Undocumented |
| Class Variable | `config` | Undocumented |
| Class Variable | `date` | Undocumented |
| Class Variable | `directory` | Undocumented |
| Class Variable | `mapping` | Undocumented |
| Class Variable | `partition` | Undocumented |
| Class Variable | `type` | Undocumented |
| Property | `to` | Undocumented |
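The `partition` and `directory` class variables above suggest a directory-partitioned on-disk layout. As a concept illustration only, the stdlib-only sketch below shows how hive-style partition keys (e.g. `year=2024`) could be recovered from a file path; the helper name and the assumed layout are not part of the library's API.

```python
from pathlib import PurePosixPath


def partition_values(path: str) -> dict[str, str]:
    """Extract hive-style partition key/value pairs from a file path.

    Illustrative only: the actual layout used by
    PartitionedParquetFileSource may differ.
    """
    values = {}
    for part in PurePosixPath(path).parts:
        # Directory components like "year=2024" encode a partition column.
        if "=" in part:
            key, _, value = part.partition("=")
            values[key] = value
    return values


print(partition_values("data/year=2024/month=05/part-0.parquet"))
# {'year': '2024', 'month': '05'}
```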
Inherited from CodableBatchDataSource:

| Kind | Name | Summary |
| --- | --- | --- |
| Class Method | `_deserialize` | Undocumented |
| Method | `_serialize` | Undocumented |

Inherited from BatchDataSource (via CodableBatchDataSource):

| Kind | Name | Summary |
| --- | --- | --- |
| Method | `all` | Undocumented |
| Method | `all` | Undocumented |
| Method | `depends` | Undocumented |
| Method | `features` | Undocumented |
| Method | `filter` | Undocumented |
| Async Method | `freshness` | `.table("my_table").freshness()` |
| Method | `location` | Undocumented |
| Method | `source` | An id that identifies a source from others. |
| Method | `tags` | Undocumented |
| Method | `transform` | Undocumented |
| Method | `with` | Undocumented |

Inherited from ColumnFeatureMappable (via CodableBatchDataSource, BatchDataSource):

| Kind | Name | Summary |
| --- | --- | --- |
| Method | `columns` | Undocumented |
| Method | `feature` | Undocumented |
| Method | `with` | Undocumented |

Inherited from DataFileReference (via CodableBatchDataSource, BatchDataSource, ColumnFeatureMappable):

| Kind | Name | Summary |
| --- | --- | --- |
| Async Method | `read` | Undocumented |
| Async Method | `to` | Undocumented |
| Async Method | `write` | Undocumented |
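The inherited async `freshness` helper, shown above only as the fragment `.table("my_table").freshness()`, conceptually answers "how recent is the newest event in this source?". The stdlib-only sketch below illustrates that idea; the function and column names are assumptions, not the library's implementation.

```python
from datetime import datetime, timezone


def freshness(rows, event_column):
    """Return the newest event timestamp in the rows, or None if empty.

    A conceptual stand-in for a source-level freshness check; this is
    not the aligned library's implementation.
    """
    timestamps = [
        row[event_column] for row in rows if row.get(event_column) is not None
    ]
    # max() over timestamps gives the most recent event; default=None
    # covers an empty source.
    return max(timestamps, default=None)


rows = [
    {"id": 1, "updated_at": datetime(2024, 1, 1, tzinfo=timezone.utc)},
    {"id": 2, "updated_at": datetime(2024, 3, 5, tzinfo=timezone.utc)},
]
print(freshness(rows, "updated_at"))
# 2024-03-05 00:00:00+00:00
```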
```python
def multi_source_features_for(
    cls,
    facts: RetrivalJob,
    requests: list[tuple[ParquetFileSource, RetrivalRequest]],
) -> RetrivalJob:
```

Undocumented

```python
…: RetrivalRequest, start_date: datetime, end_date: datetime) -> RetrivalJob:
```

Undocumented
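The truncated signature above takes `start_date` and `end_date` parameters; conceptually, such a method restricts the source to rows whose event time falls in a date range. A hypothetical sketch of that filtering, assuming a half-open range `[start_date, end_date)` (the library's actual range semantics are not documented here):

```python
from datetime import datetime


def rows_between(rows, date_column, start_date, end_date):
    """Keep rows whose event time falls in [start_date, end_date).

    Hypothetical helper illustrating the start_date/end_date parameters;
    not the aligned library's implementation.
    """
    return [row for row in rows if start_date <= row[date_column] < end_date]


rows = [
    {"id": 1, "event_at": datetime(2024, 1, 15)},
    {"id": 2, "event_at": datetime(2024, 2, 20)},
    {"id": 3, "event_at": datetime(2024, 4, 1)},
]
selected = rows_between(rows, "event_at", datetime(2024, 1, 1), datetime(2024, 3, 1))
print([row["id"] for row in selected])
# [1, 2]
```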
Setup the code needed to represent the data source as a feature view.

```python
FileSource.parquet("my_path.parquet").feature_view_code(view_name="my_view")
>>> """from aligned import FeatureView, String, Int64, Float

class MyView(FeatureView):

    metadata = FeatureView.metadata_with(
        name="Embarked",
        description="some description",
        batch_source=FileSource.parquet("my_path.parquet"),
        stream_source=None,
    )

    Passenger_id = Int64()
    Survived = Int64()
    Pclass = Int64()
    Name = String()
    Sex = String()
    Age = Float()
    Sibsp = Int64()
    Parch = Int64()
    Ticket = String()
    Fare = Float()
    Cabin = String()
    Embarked = String()"""
```

Returns:
    str: The code needed to set up a basic feature view.
Returns the schema for the data source.

```python
source = FileSource.parquet_at('test_data/titanic.parquet')
schema = await source.schema()
>>> {'passenger_id': FeatureType(name='int64'), ...}
```

Returns:
    dict[str, FeatureType]: A dictionary containing the column name and the feature type.