API reference
Opening files
- open(path, mode='r')
Open a BigWig or BigBed file for reading or writing.
- Parameters:
path_url_or_file_like (str or file-like object) – The path to a file or an HTTP URL for a remote file as a string, or a Python file-like object with read and seek methods.
mode (Literal["r", "w"], optional [default: "r"]) – The mode to open the file in. If not provided, it defaults to read. "r" opens a bigWig/bigBed for reading but does not allow writing. "w" opens a bigWig/bigBed for writing but does not allow reading.
- Returns:
The object for reading or writing the BigWig or BigBed file.
- Return type:
BigWigWrite or BigBedWrite or BBIRead
Notes
For writing, only a file path is currently accepted.
If passing a file-like object, concurrent reading of different intervals is not supported and may result in incorrect behavior.
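A minimal sketch of read-mode usage with the context-manager protocol. The module name `pybigtools`, the helper `binned_means`, and its `opener` parameter are assumptions made for illustration; only `open`, its mode argument, and `values` with `bins`, `summary`, and `exact` are taken from this reference.

```python
def binned_means(path, chrom, start, end, bins=10, opener=None):
    """Return `bins` exact mean values for chrom:start-end as a list of floats."""
    if opener is None:
        # Assumption: the package documented here is importable as `pybigtools`.
        import pybigtools
        opener = pybigtools.open
    # "r" opens the file for reading only; the reader closes on leaving the block.
    with opener(path, "r") as reader:
        arr = reader.values(chrom, start, end, bins=bins, summary="mean", exact=True)
        return [float(v) for v in arr]
```

The `opener` hook exists only so the helper can be exercised without a real file; in normal use it can be left as `None`.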
Reading
- class BBIReader(rust_reader)
Interface for reading a BigWig or BigBed file.
Returned by open() in read mode. Use the methods below to query chromosomes, intervals, records, zoom levels, and summary statistics, or extract values as NumPy arrays. Supports the context-manager protocol.
- average_over_bed(bed, names=None, stats=None)
Gets the average values from a bigWig over the entries of a bed file.
- Parameters:
bed (str or Path) – The path to the bed file.
names (bool or int, optional) –
  - If None, no name is returned and each return value is only the statistics value (see the stats parameter).
  - If True, each return value is a 2-length tuple of the value of column 4 and the statistics value.
  - If False, each return value is a 2-length tuple of the interval in the format {chrom}:{start}-{end} and the statistics value.
  - If 0, each return value is the same as if False was passed.
  - If 1 or greater, each return value is a tuple of the value of the column given by this parameter (1-based) and the statistics value.
stats (str or List[str], optional) – Calculate specific statistics for each bed entry.
  - If not specified, mean is returned.
  - If "all" is specified, all summary statistics are returned in a named tuple.
  - If a single statistic is provided as a string, that statistic is returned as a float or int depending on the statistic.
  - If a list of statistics is provided, a tuple is returned containing those statistics, in order.
- Possible statistics are:
size: Size of bed entry (int)
bases: Bases covered by bigWig (int)
sum: Sum of values over all bases covered (float)
mean0: Average over bases with non-covered bases counting as zeroes (float)
mean or None: Average over just covered bases (float)
min: Minimum over all bases covered (float)
max: Maximum over all bases covered (float)
- Return type:
Generator of float or tuple.
Notes
If no name field is specified, returns a generator of statistics (either floats or tuples, as specified by the stats field). If a name column is specified, returns a generator of 2-length tuples of the form ({name}, {average}). Importantly, if the statistics value is itself a tuple, then that tuple will be nested as the second value of the outer tuple.
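To make the statistics listed above concrete, the following pure-Python sketch computes them for one bed entry from bigWig-style (start, end, value) intervals. It illustrates the definitions only and is not the library's implementation; `Stats`, `entry_stats`, and the NaN results for an entry with no covered bases are assumptions.

```python
from collections import namedtuple

# Hypothetical container mirroring the statistic names listed above.
Stats = namedtuple("Stats", "size bases sum mean0 mean min max")

def entry_stats(intervals, start, end):
    """Summarize non-overlapping (start, end, value) intervals over [start, end)."""
    size = end - start
    bases = 0
    total = 0.0
    lo, hi = float("inf"), float("-inf")
    for s, e, v in intervals:
        overlap = min(e, end) - max(s, start)
        if overlap <= 0:
            continue  # interval lies outside the bed entry
        bases += overlap
        total += v * overlap
        lo, hi = min(lo, v), max(hi, v)
    if bases == 0:
        # Assumption: nothing covered, so mean/min/max are undefined here.
        nan = float("nan")
        return Stats(size, 0, 0.0, 0.0, nan, nan, nan)
    # mean0 divides by the entry size, mean by the covered bases only.
    return Stats(size, bases, total, total / size, total / bases, lo, hi)

stats = entry_stats([(0, 10, 1.0), (20, 30, 3.0)], 5, 25)
```

In this example half of the 20-base entry is covered, so `mean0` is half of `mean`.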
- chroms(chrom=None)
Return the names of chromosomes in a BBI file and their lengths.
- info()
Return a dict of information about the BBI file.
- records(chrom, start=None, end=None)
Return the records of a given range on a chromosome.
The result is an iterator of tuples. For BigWigs, these tuples are in the format (start: int, end: int, value: float). For BigBeds, these tuples are in the format (start: int, end: int, …), where the “rest” fields are split by whitespace.
- Parameters:
chrom (str) – Name of the chromosome.
start (int, optional) – Start of the range to get values for. If not provided, it defaults to the beginning of the chromosome.
end (int, optional) – End of the range to get values for. If not provided, it defaults to the length of the chromosome.
- Returns:
An iterator of tuples in the format (start: int, end: int, value: float) for BigWigs, or (start: int, end: int, *rest) for BigBeds.
- Return type:
Notes
Missing values in BigWigs will result in non-contiguous records.
See also
zoom_records – Get the zoom records of a given range on a chromosome.
values – Get the values of a given range on a chromosome.
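Because missing values produce non-contiguous records, it can be useful to locate the uncovered stretches. The sketch below works on any iterable of (start, end, ...) tuples shaped like the ones described above; the helper `uncovered_gaps` is hypothetical, and it assumes the records arrive sorted by start position.

```python
def uncovered_gaps(records, start, end):
    """Yield (gap_start, gap_end) ranges in [start, end) that no record covers."""
    cursor = start
    for rec_start, rec_end, *_rest in records:
        if rec_start > cursor:
            # Nothing covers the bases between the cursor and this record.
            yield (cursor, min(rec_start, end))
        cursor = max(cursor, rec_end)
        if cursor >= end:
            break
    if cursor < end:
        # Trailing uncovered stretch after the last record.
        yield (cursor, end)

gaps = list(uncovered_gaps([(0, 10, 1.0), (15, 20, 2.0)], 0, 25))
```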
- sql(parse=False)
Return the autoSql schema definition of this BBI file.
For BigBeds, this schema comes directly from the autoSql string stored in the file. For BigWigs, the schema generated describes a bedGraph file.
- Parameters:
parse (bool, optional [default: False]) – If True, return the schema as a dictionary. If False, return the schema as a string.
- Returns:
schema – The autoSql schema of the BBI file. If parse is True, the schema is returned as a dictionary of the format:
{
    "name": <declared name>,
    "comment": <declaration comment>,
    "fields": [(<field name>, <field type>, <field comment>), ...],
}
- Return type:
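As a small illustration of working with the parsed form, the sketch below maps field names to field types. The `schema` dictionary is a hand-written, hypothetical example that follows the documented layout; it is not output captured from a real file, and `field_types` is an illustrative helper.

```python
# Hypothetical parsed schema in the documented {"name", "comment", "fields"} layout.
schema = {
    "name": "bedGraph",
    "comment": "example declaration comment",
    "fields": [
        ("chrom", "string", "chromosome name"),
        ("chromStart", "uint", "start position"),
        ("chromEnd", "uint", "end position"),
        ("value", "float", "signal value"),
    ],
}

def field_types(parsed_schema):
    """Map each field name to its declared type, preserving field order."""
    return {name: ftype for name, ftype, _comment in parsed_schema["fields"]}

types = field_types(schema)
```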
- values(chrom, start, end, bins=None, summary='mean', exact=False, uncovered=None, oob=nan, fillna=None, arr=None)
Return the values of a given range on a chromosome as a numpy array.
For BigWigs, the returned values or summary statistics are derived from the unique signal values associated with each base.
For BigBeds, the returned values or summary statistics instead are derived from the number of BED intervals overlapping each base.
- Parameters:
chrom (str) – Name of the chromosome.
start (int, optional) – Start of the range to get values for. If not provided, it defaults to the beginning of the chromosome.
end (int, optional) – End of the range to get values for. If not provided, it defaults to the length of the chromosome.
bins (int, optional) – If provided, the query interval will be divided into equally spaced bins and the values in each bin will be interpolated or summarized. If not provided, the values will be returned for each base.
summary (str, optional [default: "mean"]) – The summary statistic to use. One of mean, std, min, max, sum, sum_squares, bases_covered, bin_covered.
exact (bool, optional [default: False]) – If True and bins is specified, return exact summary statistic values instead of interpolating from the optimal zoom level.
uncovered (float or None, optional [default: None]) – The value assigned to all uncovered bases. If None, uncovered bases are excluded from summary statistic calculations, and empty positions or bins will be returned as NaN (subject to fillna). To treat uncovered bases as having a value of zero in summary statistics (like UCSC's mean0), set this parameter to 0.0. Empty positions or bins will also be returned as 0.0. Other finite values are also valid and will be used in the same way. This parameter is ignored for the bases_covered and bin_covered summaries, since they exclude uncovered bases by definition.
oob (float, optional [default: NaN]) – Fill-in value for out-of-bounds regions.
fillna (float or None, optional [default: None]) – Post-rasterization fill applied to in-bounds positions or bins that are returned as NaN due to being empty. The default None leaves NaN values untouched.
arr (numpy.ndarray, optional) – If provided, the values will be written to this array or array view. The array must be of the correct size and type.
- Returns:
The signal values of the bigwig or bigbed in the specified range.
- Return type:
Notes
A BigWig file encodes a step function, and the value at a base is given by the signal value of the unique interval that contains that base.
A BigBed file encodes a collection of (possibly overlapping) intervals which may or may not be associated with quantitative scores. The “value” at given base used here summarizes the number of intervals overlapping that base, not any particular score.
If a number of bins is requested and exact is False, the summarized data is interpolated from the closest available zoom level. If you need accurate summary data and are okay with a small trade-off in speed, set exact to True.
See also
records – Get the records of a given range on a chromosome.
zoom_records – Get the zoom records of a given range on a chromosome.
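The interplay of uncovered, oob, and fillna in the per-base case (no bins) can be sketched in pure Python. This is an illustration of the semantics described above under stated assumptions, not the library's implementation: `rasterize` and its `chrom_len` argument are hypothetical, and plain lists stand in for NumPy arrays.

```python
import math

def rasterize(intervals, chrom_len, start, end, uncovered=None, oob=math.nan, fillna=None):
    """Return one value per base in [start, end) from (start, end, value) intervals."""
    out = []
    for pos in range(start, end):
        if pos < 0 or pos >= chrom_len:
            out.append(oob)  # out of bounds: never touched by fillna
            continue
        # Uncovered bases start as NaN unless an explicit value is given.
        value = math.nan if uncovered is None else uncovered
        for s, e, v in intervals:
            if s <= pos < e:
                value = v
                break
        if fillna is not None and math.isnan(value):
            value = fillna  # post-rasterization fill, in-bounds only
        out.append(value)
    return out

signal = [(2, 4, 5.0)]
as_zero = rasterize(signal, 6, 0, 8, uncovered=0.0, oob=-1.0)
filled = rasterize(signal, 6, 0, 8, oob=-1.0, fillna=9.0)
```

With a 6-base chromosome and a query of 0 to 8, the last two positions are out of bounds and take the oob value in both calls.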
- zoom_records(reduction_level, chrom, start=None, end=None)
Return the zoom records of a given range on a chromosome for a given zoom level.
The result is an iterator of tuples. These tuples are in the format (start: int, end: int, summary: dict).
- Parameters:
reduction_level (int) – The zoom level to use, as a resolution in bases. Use the zooms method to get a list of available zoom levels.
chrom (str) – Name of the chromosome.
start (int, optional) – Start of the range to get values for. If not provided, it defaults to the beginning of the chromosome.
end (int, optional) – End of the range to get values for. If not provided, it defaults to the length of the chromosome.
- Returns:
An iterator of tuples in the format (start: int, end: int, summary: dict).
- Return type:
Iterator[tuple[int,int,dict]]
Notes
The summary dictionary contains the following keys:
- total_items: The number of items in the interval.
- bases_covered: The number of bases covered by the interval.
- min_val: The minimum value in the interval.
- max_val: The maximum value in the interval.
- sum: The sum of all values in the interval.
- sum_squares: The sum of the squares of all values in the interval.
For BigWigs, the summary statistics are derived from the unique signal values associated with each base in the interval.
For BigBeds, the summary statistics instead are derived from the number of BED intervals overlapping each base in the interval.
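The summary keys are sufficient to derive a mean and a standard deviation over the covered bases. The sketch below uses the population form of the variance; whether the library's own std summary divides by n or n - 1 is not stated in this reference, so treat that choice, and the helper name `mean_and_std`, as assumptions.

```python
import math

def mean_and_std(summary):
    """Derive (mean, population std) over covered bases from a zoom summary dict."""
    n = summary["bases_covered"]
    if n == 0:
        return math.nan, math.nan
    mean = summary["sum"] / n
    # Population variance: E[x^2] - E[x]^2, clamped against rounding error.
    variance = (summary["sum_squares"] - summary["sum"] ** 2 / n) / n
    return mean, math.sqrt(max(variance, 0.0))

# Hypothetical summary for four covered bases with values 1, 1, 3, 3.
example = {
    "total_items": 2,
    "bases_covered": 4,
    "min_val": 1.0,
    "max_val": 3.0,
    "sum": 8.0,
    "sum_squares": 20.0,
}
mean, std = mean_and_std(example)
```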
- zooms()
Return a list of sizes in bases of the summary intervals used in each of the zoom levels (i.e. reduction levels) of the BBI file.
Writing
- class BigWigWriter
Interface for writing to a BigWig file.
- close()
Close the file.
No other operations will be allowed after it is closed. This is done automatically after write is performed.
- write(chroms, vals)
Write values to the BigWig file.
- Parameters:
Notes
The underlying file will be closed automatically when the function completes, and no other operations will be able to be performed.
- class BigBedWriter
Interface for writing to a BigBed file.
- close()
Close the file.
No other operations will be allowed after it is closed. This is done automatically after write is performed.
- write(chroms, vals, autosql=None)
Write values to the BigBed file.
- Parameters:
chroms (Dict[str, int]) – A dictionary with keys as chromosome names and values as their length.
vals (Iterable[tuple[str, int, int, str]]) – An iterable with values that represents each value to write in the format (chromosome, start, end, rest). The rest string should consist of tab-delimited fields.
Notes
The underlying file will be closed automatically when the function completes, and no other operations will be able to be performed.
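A sketch of preparing the two arguments described above from already-parsed BED rows. The helper `bed_rows_to_vals` and the sample data are hypothetical; the only facts taken from this reference are the (chromosome, start, end, rest) tuple shape, the tab-delimited rest string, and the chromosome-length dictionary.

```python
def bed_rows_to_vals(rows):
    """Yield (chrom, start, end, rest) tuples, joining extra columns with tabs."""
    for chrom, start, end, *extra in rows:
        rest = "\t".join(str(field) for field in extra)
        yield (chrom, int(start), int(end), rest)

# Hypothetical inputs: chromosome lengths and rows with name, score, strand columns.
chroms = {"chr1": 1000}
rows = [
    ("chr1", 0, 100, "geneA", 960, "+"),
    ("chr1", 200, 300, "geneB", 500, "-"),
]
vals = list(bed_rows_to_vals(rows))
# These would then be passed as write(chroms, vals) on a writer opened in "w" mode.
```

Since the writer closes the file when write completes, all entries need to be supplied in this single iterable.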
Iterators
- class BigWigIntervalIterator
An iterator for intervals in a bigWig.
It returns only values that exist in the bigWig, skipping any missing intervals.
- class BigBedEntriesIterator
An iterator for the entries in a bigBed.
Summary statistics
Exceptions
- exception BBIFileClosed
BBI File is closed.
- exception BBIReadError
Error reading BBI file.