DataFrame

Struct DataFrame 

pub struct DataFrame { /* private fields */ }

A DataFrame-centric pipeline compiled into a lazy plan.

The public API exposes only this crate’s own types. The current engine is Polars, but callers never need to depend on Polars types directly.

Implementations

impl DataFrame

pub fn from_dataset(ds: &DataSet) -> IngestionResult<Self>

Build a pipeline starting from an in-memory DataSet.

Note: this first converts the dataset into a Polars DataFrame. Every transformation added after that is only planned lazily and runs when the pipeline is collected.
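
As a rough illustration of lazy planning, the sketch below records steps in a plan and only applies them when collect() is called. LazyColumn and Step are hypothetical stand-ins written for this example; they are not this crate's types and say nothing about how the Polars-backed plan is actually represented.

```rust
// Hypothetical stand-in for a lazy pipeline over one nullable Float64 column.
#[derive(Clone, Debug, PartialEq)]
enum Step {
    MultiplyF64(f64),
    AddF64(f64),
}

#[derive(Clone, Debug)]
struct LazyColumn {
    data: Vec<Option<f64>>,
    plan: Vec<Step>,
}

impl LazyColumn {
    fn from_values(data: Vec<Option<f64>>) -> Self {
        Self { data, plan: Vec::new() }
    }

    // Builder methods only record a step; the data is not touched here.
    fn multiply_f64(mut self, factor: f64) -> Self {
        self.plan.push(Step::MultiplyF64(factor));
        self
    }

    fn add_f64(mut self, delta: f64) -> Self {
        self.plan.push(Step::AddF64(delta));
        self
    }

    // The recorded plan runs only when the result is requested.
    fn collect(self) -> Vec<Option<f64>> {
        let plan = self.plan;
        self.data
            .into_iter()
            .map(|cell| {
                plan.iter().fold(cell, |acc, step| match step {
                    Step::MultiplyF64(f) => acc.map(|v| v * f),
                    Step::AddF64(d) => acc.map(|v| v + d),
                })
            })
            .collect()
    }
}
```

Calling multiply_f64 and add_f64 on such a value leaves data unchanged and grows plan by one entry per call; the arithmetic happens once, inside collect().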

pub fn filter(self, predicate: Predicate) -> IngestionResult<Self>

Add a filter predicate.

pub fn multiply_f64(self, column: &str, factor: f64) -> IngestionResult<Self>

Multiply a Float64 column by a constant factor (nulls remain null).

pub fn add_f64(self, column: &str, delta: f64) -> IngestionResult<Self>

Add a constant Float64 value to a column (nulls remain null).

pub fn with_mul_f64(self, name: &str, source: &str, factor: f64) -> IngestionResult<Self>

Add a derived Float64 column: name = source * factor (nulls remain null).

pub fn with_add_f64(self, name: &str, source: &str, delta: f64) -> IngestionResult<Self>

Add a derived Float64 column: name = source + delta (nulls remain null).
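
The difference between the in-place methods (multiply_f64, add_f64) and the derived-column methods (with_mul_f64, with_add_f64) can be sketched on a plain map of nullable columns. Table, multiply_in_place and with_mul are hypothetical names for this example only, and returning None for a missing column is a simplification of the sketch, not documented behavior of this crate.

```rust
use std::collections::BTreeMap;

// Hypothetical in-memory table: column name -> nullable Float64 cells.
type Table = BTreeMap<String, Vec<Option<f64>>>;

// In-place style (like multiply_f64): the column itself is overwritten.
fn multiply_in_place(mut table: Table, column: &str, factor: f64) -> Option<Table> {
    let cells = table.get_mut(column)?;
    for cell in cells.iter_mut() {
        *cell = cell.map(|v| v * factor); // None (null) stays None
    }
    Some(table)
}

// Derived style (like with_mul_f64): name = source * factor, source is kept.
fn with_mul(mut table: Table, name: &str, source: &str, factor: f64) -> Option<Table> {
    let derived: Vec<Option<f64>> = table
        .get(source)?
        .iter()
        .map(|cell| cell.map(|v| v * factor)) // nulls propagate to the new column
        .collect();
    table.insert(name.to_string(), derived);
    Some(table)
}
```

Using Option<f64> for cells makes "nulls remain null" fall out of Option::map: the closure only runs for present values.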

pub fn select(self, columns: &[&str]) -> IngestionResult<Self>

Select a subset of columns (in the provided order).

pub fn rename(self, pairs: &[(&str, &str)]) -> IngestionResult<Self>

Rename columns.

This uses Polars’ rename(..., strict=true) behavior: every from column named in pairs must exist.
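
A minimal sketch of what strict renaming means, using a plain list of column names. rename_strict is a hypothetical helper for illustration; it does not show whether the real method reports a missing column at call time or at collect() time.

```rust
// Hypothetical sketch of strict renaming over a list of column names.
fn rename_strict(columns: &[String], pairs: &[(&str, &str)]) -> Result<Vec<String>, String> {
    // Strict mode: reject the whole call if any `from` name is absent.
    for &(from, _) in pairs {
        if !columns.iter().any(|c| c.as_str() == from) {
            return Err(format!("column not found: {from}"));
        }
    }
    // Each column is looked up once, so renames apply simultaneously.
    Ok(columns
        .iter()
        .map(|c| {
            pairs
                .iter()
                .find(|&&(from, _)| c.as_str() == from)
                .map(|&(_, to)| to.to_string())
                .unwrap_or_else(|| c.clone())
        })
        .collect())
}
```

With columns a and b, the pair ("a", "x") yields x and b, while a pair whose from name is not present makes the whole call fail instead of being silently skipped.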

pub fn cast(self, column: &str, to: DataType) -> IngestionResult<Self>

Cast a column to a target type.

Note: cast errors (e.g. invalid parses) surface at collect() time.

pub fn cast_with_mode(self, column: &str, to: DataType, mode: CastMode) -> IngestionResult<Self>

Cast a column with an explicit mode (strict vs lossy).
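
The strict/lossy distinction can be illustrated with string-to-float parsing. The Mode enum and cast_to_f64 below are hypothetical, and the sketch assumes that lossy mode turns unparseable values into nulls (as a non-strict Polars cast does); the actual CastMode variants and their exact semantics are not shown on this page.

```rust
// Hypothetical mirror of a strict/lossy cast mode; the real CastMode may differ.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Mode {
    Strict,
    Lossy,
}

// Parse string cells to f64. Assumption: lossy mode turns bad values into nulls.
fn cast_to_f64(cells: &[Option<&str>], mode: Mode) -> Result<Vec<Option<f64>>, String> {
    cells
        .iter()
        .map(|cell| match cell {
            None => Ok(None), // nulls pass through unchanged
            Some(text) => match (text.trim().parse::<f64>(), mode) {
                (Ok(v), _) => Ok(Some(v)),
                (Err(_), Mode::Lossy) => Ok(None),
                (Err(e), Mode::Strict) => Err(format!("cannot cast {text:?}: {e}")),
            },
        })
        .collect()
}
```

In this sketch the input "1.5", "abc", null casts to 1.5, null, null in lossy mode and returns an error in strict mode.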

pub fn drop(self, columns: &[&str]) -> IngestionResult<Self>

Drop columns by name.

pub fn fill_null(self, column: &str, value: Value) -> IngestionResult<Self>

Fill nulls in a column with a literal.

pub fn with_literal(self, name: &str, value: Value) -> IngestionResult<Self>

Add a derived column with a literal value.

pub fn group_by(self, keys: &[&str], aggs: &[Agg]) -> IngestionResult<Self>

Group rows by keys and compute aggregations.
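
Conceptually, grouping buckets rows by key and folds each bucket into one value per aggregation. The sketch below does this for a single key and a single aggregation; AggKind and this free-standing group_by function are hypothetical, since the variants of the crate's Agg type are not listed here.

```rust
use std::collections::BTreeMap;

// Hypothetical aggregation kinds; the real Agg type is not shown on this page.
#[derive(Clone, Copy, Debug)]
enum AggKind {
    Sum,
    Count,
}

// Group (key, value) rows by key and compute one aggregate per group.
fn group_by(rows: &[(&str, f64)], agg: AggKind) -> BTreeMap<String, f64> {
    let mut out: BTreeMap<String, f64> = BTreeMap::new();
    for &(key, value) in rows {
        // Each distinct key gets one accumulator, created on first sight.
        let slot = out.entry(key.to_string()).or_insert(0.0);
        match agg {
            AggKind::Sum => *slot += value,
            AggKind::Count => *slot += 1.0,
        }
    }
    out
}
```

The result has one entry per distinct key, which matches the usual shape of a group-by output: one row per group.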

pub fn join(self, other: DataFrame, left_on: &[&str], right_on: &[&str], how: JoinKind) -> IngestionResult<Self>

Join this pipeline with another DataFrame on key columns.

Note: join planning is infallible; missing-column errors surface at collect() time.
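
For readers unfamiliar with key-based joins, here is a small hash-join sketch over tuples. The Kind enum with Inner and Left variants and the join_rows function are assumptions made for this example; the real JoinKind variants are not listed on this page.

```rust
use std::collections::HashMap;

// Hypothetical join kinds; the real JoinKind variants are not listed here.
#[derive(Clone, Copy, Debug)]
enum Kind {
    Inner,
    Left,
}

// Join (key, name) rows with (key, score) rows on the key column.
fn join_rows(
    left: &[(u32, &str)],
    right: &[(u32, f64)],
    how: Kind,
) -> Vec<(u32, String, Option<f64>)> {
    // Index the right side by key so each left row is a hash lookup.
    let mut index: HashMap<u32, Vec<f64>> = HashMap::new();
    for &(key, score) in right {
        index.entry(key).or_default().push(score);
    }

    let mut out = Vec::new();
    for &(key, name) in left {
        match (index.get(&key), how) {
            // One output row per matching right row.
            (Some(scores), _) => {
                for &score in scores {
                    out.push((key, name.to_string(), Some(score)));
                }
            }
            // Left join keeps unmatched left rows with a null on the right.
            (None, Kind::Left) => out.push((key, name.to_string(), None)),
            // Inner join drops unmatched left rows.
            (None, Kind::Inner) => {}
        }
    }
    out
}
```

The two kinds differ only in how unmatched left rows are treated, which is why the null-carrying Option<f64> appears in the output type.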

pub fn collect(self) -> IngestionResult<DataSet>

Collect the pipeline into an in-memory DataSet.

pub fn collect_with_schema(self, schema: &Schema) -> IngestionResult<DataSet>

Collect the pipeline into an in-memory DataSet, enforcing an explicit output schema.

pub fn reduce(self, column: &str, op: ReduceOp) -> IngestionResult<Option<Value>>

Reduce a column using a built-in ReduceOp (Polars-backed).

Returns Ok(None) if the column does not exist (aligned with crate::processing::reduce).

pub fn sum(self, column: &str) -> IngestionResult<Option<Value>>

Reduce a numeric column by summing its values. Nulls are ignored; an all-null column yields null.

Returns Ok(None) if the column does not exist (aligned with crate::processing::reduce).
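
The return type distinguishes two situations that are easy to confuse: a missing column (None) and a column whose sum is null (a null value inside Some). The sketch below models that with a hypothetical Cell enum standing in for Value; the real Value variants may differ, and treating an empty column like an all-null one is an assumption of this sketch.

```rust
use std::collections::BTreeMap;

// Hypothetical stand-in for the crate's Value type; real variants may differ.
#[derive(Clone, Debug, PartialEq)]
enum Cell {
    Null,
    Float64(f64),
}

// Sum a nullable column: None = column missing, Some(Cell::Null) = nothing to sum.
fn sum_column(table: &BTreeMap<String, Vec<Option<f64>>>, column: &str) -> Option<Cell> {
    let cells = table.get(column)?; // missing column -> None
    let mut total: Option<f64> = None;
    // flatten() skips null cells, so they never affect the total.
    for v in cells.iter().flatten() {
        total = Some(total.unwrap_or(0.0) + v);
    }
    Some(match total {
        Some(t) => Cell::Float64(t),
        None => Cell::Null, // all-null (and, in this sketch, empty) column
    })
}
```

A caller can therefore match on None to detect a typo in the column name, and on the null variant to detect a column with no usable values.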

pub fn feature_wise_mean_std(self, columns: &[&str], std_kind: VarianceKind) -> IngestionResult<Vec<(String, FeatureMeanStd)>>

Compute the mean and standard deviation of each listed column in a single Polars collect. std_kind maps to Polars’ ddof (delta degrees of freedom). Columns are cast to Float64 first (aligned with the scalar reduces).

Returns an error if any column name is missing from the lazy schema.
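
To make the role of ddof concrete: the standard deviation divides the sum of squared deviations by n - ddof before taking the square root. The sketch below assumes a population kind maps to ddof = 0 and a sample kind to ddof = 1, and that nulls are skipped; Kind and mean_std are hypothetical, and the real VarianceKind variants and null handling are not stated on this page.

```rust
// Hypothetical mirror of a variance kind; the real VarianceKind may differ.
#[derive(Clone, Copy, Debug)]
enum Kind {
    Population, // assumed ddof = 0
    Sample,     // assumed ddof = 1
}

// Returns (mean, std) over non-null values, or None when there are too few values.
fn mean_std(cells: &[Option<f64>], kind: Kind) -> Option<(f64, f64)> {
    let values: Vec<f64> = cells.iter().flatten().copied().collect();
    let ddof: usize = match kind {
        Kind::Population => 0,
        Kind::Sample => 1,
    };
    let n = values.len();
    if n <= ddof {
        return None; // the divisor n - ddof would be zero or negative
    }
    let mean = values.iter().sum::<f64>() / n as f64;
    // Sum of squared deviations from the mean.
    let ss: f64 = values.iter().map(|v| (v - mean).powi(2)).sum();
    Some((mean, (ss / (n - ddof) as f64).sqrt()))
}
```

For the values 2, 4, 4, 4, 5, 5, 7, 9 the mean is 5 and the sum of squared deviations is 32, so the population standard deviation is sqrt(32 / 8) = 2, while the sample standard deviation is sqrt(32 / 7).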

Trait Implementations

impl Clone for DataFrame

fn clone(&self) -> DataFrame

Returns a duplicate of the value.

fn clone_from(&mut self, source: &Self)

Performs copy-assignment from source.
