blueetl.extract.base
Base extractor.
Classes
BaseExtractor: Base extractor class.
- class blueetl.extract.base.BaseExtractor(df: DataFrame, cached: bool, filtered: bool)
Bases: ABC
Base extractor class.
Initialize the extractor.
- Parameters:
df – Pandas DataFrame containing the extracted data.
cached – True if the data have been extracted from the cache, False otherwise.
filtered – True if the data have been filtered using a custom query, False otherwise.
- property df: DataFrame
Return the internally wrapped dataframe.
- classmethod from_pandas(df: DataFrame, query: dict | None = None, cached: bool = True) -> ExtractorT
Return a new object from the given dataframe.
If a query is specified, it’s passed to etl.q and applied as a filter.
It can be overridden together with to_pandas if some columns are not serializable.
- Parameters:
df – dataframe to load.
query – optional filter dictionary, passed to etl.q.
cached – True if the data is loaded from the cache, False otherwise.
- Returns:
a new extractor instance.
- to_pandas() -> DataFrame
Return a dataframe that can be serialized and stored to disk.
It should be possible to call from_pandas with the returned dataframe to create an equivalent object.
It can be overridden together with from_pandas if some columns are not serializable.
- Returns:
serializable dataframe.
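The round trip between to_pandas and from_pandas can be sketched as follows. This is a minimal illustration under stated assumptions, not blueetl's implementation: SketchExtractor is a hypothetical stand-in that does not subclass BaseExtractor, the params column holding dicts is invented to represent a non-serializable column, the query handling is a simplified equality filter rather than the real etl.q, and setting filtered from the presence of a query is assumed from the parameter description above.

```python
import json

import pandas as pd


class SketchExtractor:
    """Hypothetical stand-in mirroring the from_pandas/to_pandas contract."""

    def __init__(self, df: pd.DataFrame, cached: bool, filtered: bool) -> None:
        self._df = df
        self.cached = cached
        self.filtered = filtered

    @property
    def df(self) -> pd.DataFrame:
        """Return the internally wrapped dataframe."""
        return self._df

    @classmethod
    def from_pandas(cls, df, query=None, cached=True):
        """Rebuild an extractor from a dataframe produced by to_pandas."""
        df = df.copy()
        # Decode the JSON strings back into dicts (inverse of to_pandas).
        df["params"] = df["params"].map(json.loads)
        if query:
            # Assumption: plain equality per column, NOT the real etl.q.
            mask = pd.Series(True, index=df.index)
            for column, value in query.items():
                mask &= df[column] == value
            df = df[mask]
        # Assumption: filtered is True whenever a query was applied.
        return cls(df, cached=cached, filtered=bool(query))

    def to_pandas(self) -> pd.DataFrame:
        """Return a copy whose dict column is encoded as JSON strings."""
        df = self._df.copy()
        df["params"] = df["params"].map(json.dumps)
        return df


original = SketchExtractor(
    pd.DataFrame({"gid": [1, 2, 3], "params": [{"a": 1}, {"a": 2}, {"a": 3}]}),
    cached=False,
    filtered=False,
)
stored = original.to_pandas()  # safe to write to disk
restored = SketchExtractor.from_pandas(stored)  # equivalent object
subset = SketchExtractor.from_pandas(stored, query={"gid": 2})  # filtered load
```

Overriding both methods together keeps the encoding and decoding steps symmetric, which is what allows from_pandas to recreate an equivalent object from the stored dataframe.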