dim_reduction module

dim_reduction.comparison_dim_reduction(X, target, outpath)

Create comparative plots of different dimensionality reduction methods. The main code is taken from scikit-learn.

Parameters
  • X – data

  • target – list

  • outpath – path where the .png image is stored

dim_reduction.plotly_comparison(data, pcolor_list, ccolor_list, annos, outpath, title_dataset)

Plot all dimensionality reduction techniques in a single Plotly figure, saved as HTML and PDF.

Parameters
  • data – numpy array

  • pcolor_list – list of project colors

  • ccolor_list – list of condition colors

  • annos – annotation dataframe: FileID, CaseID, SampleType, Project

  • outpath – output path for images

  • title_dataset – dataset title, used in the output path

Returns

None

dim_reduction.read_metadata(metadata, sample_ids)

Read a metadata file in CSV format with the columns FileID, CaseID, SampleType, Project.

Returns
  • meta_dict

  • target_names – dict

  • target – array [0,1,0,0,1,1,1,0…]

  • annotation – pd.DataFrame {ID, Condition, CaseID, Project}

  • project_arr – numerical encoding of projects [0,1,1,2,0,3,…]

  • color_list – color list for conditions

  • pcolor_list – project color list
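
As a rough illustration of the encodings listed above, the sketch below parses a small CSV with the documented columns and derives a condition encoding and a project encoding. The sample rows, the helper names `encode_column` and `encode_metadata`, and the choice to number categories in order of first appearance are assumptions made for illustration; the module's actual behavior may differ.

```python
import csv
import io

# Hypothetical sample in the documented column layout (not real data).
SAMPLE_CSV = """FileID,CaseID,SampleType,Project
f1,c1,Tumor,ProjA
f2,c2,Normal,ProjB
f3,c3,Tumor,ProjB
f4,c4,Normal,ProjC
"""


def encode_column(values):
    """Map each distinct value to an integer in order of first appearance."""
    codes = {}
    encoded = []
    for value in values:
        if value not in codes:
            codes[value] = len(codes)
        encoded.append(codes[value])
    return encoded, codes


def encode_metadata(csv_text, sample_ids):
    """Return (target, target_names, project_arr) for the requested samples."""
    rows = {row["FileID"]: row for row in csv.DictReader(io.StringIO(csv_text))}
    # Keep the order of sample_ids so the encodings line up with the data rows.
    selected = [rows[sample_id] for sample_id in sample_ids]
    target, condition_codes = encode_column(r["SampleType"] for r in selected)
    project_arr, _ = encode_column(r["Project"] for r in selected)
    # Invert the mapping so integer codes can be turned back into labels.
    target_names = {code: name for name, code in condition_codes.items()}
    return target, target_names, project_arr


target, target_names, project_arr = encode_metadata(
    SAMPLE_CSV, ["f1", "f2", "f3", "f4"]
)
print(target)        # [0, 1, 0, 1]
print(target_names)  # {0: 'Tumor', 1: 'Normal'}
print(project_arr)   # [0, 1, 1, 2]
```

Ordering the rows by `sample_ids` rather than by file order is a design choice in this sketch: it keeps the encodings aligned with the rows of the data array passed to the other functions.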

dim_reduction.scaling(data)

Scale the data with min-max scaling to the range (-1, 1).

Parameters

data – numpy array

Returns

scaled_data
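
Min-max scaling to (-1, 1) can be sketched with NumPy alone, as below. The helper name `minmax_scale_to_range` is hypothetical, and scaling each column (feature) independently is an assumption; scikit-learn's `MinMaxScaler(feature_range=(-1, 1))` also works per feature, but this page does not confirm which implementation the module uses.

```python
import numpy as np


def minmax_scale_to_range(data, low=-1.0, high=1.0):
    """Linearly rescale each column of `data` to the interval [low, high]."""
    data = np.asarray(data, dtype=float)
    col_min = data.min(axis=0)
    col_max = data.max(axis=0)
    span = col_max - col_min
    # Avoid division by zero for constant columns; they map to `low`.
    span[span == 0] = 1.0
    return low + (data - col_min) / span * (high - low)


example = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 40.0]])
scaled = minmax_scale_to_range(example)
print(scaled)
```

Each column's minimum maps to -1 and its maximum to 1, with intermediate values placed linearly in between.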

dim_reduction.silhouette_plot(data, target)

Determine a good number of clusters from a silhouette plot.

Parameters
  • data – numpy array

  • target – list
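
The idea behind a silhouette analysis can be sketched as follows: compute the mean silhouette coefficient for several candidate labelings and prefer the one with the highest score. This is a NumPy-only sketch under stated assumptions; the helper `mean_silhouette` and the toy points are hypothetical, Euclidean distance is assumed, and how the module itself uses `target` is not documented here. scikit-learn offers the same metric as `sklearn.metrics.silhouette_score`.

```python
import numpy as np


def mean_silhouette(data, labels):
    """Mean silhouette coefficient of `labels` (needs at least two clusters)."""
    data = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances between all samples.
    diffs = data[:, None, :] - data[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    clusters = np.unique(labels)
    scores = np.zeros(len(data))
    for i in range(len(data)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # Singleton clusters get a score of 0 by convention.
        # a: mean distance to the other members of the sample's own cluster.
        a = dist[i, same].sum() / (n_same - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()


# Toy data: two well-separated groups of three points each.
points = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]])
score_two = mean_silhouette(points, [0, 0, 0, 1, 1, 1])
score_three = mean_silhouette(points, [0, 0, 2, 1, 1, 1])
print(score_two > score_three)  # True
```

Splitting one of the compact groups into two clusters lowers the score, which is the signal used to choose between cluster counts.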

dim_reduction.visualization_plots(data, target, outpath, method=<class 'str'>, title_dataset=<class 'str'>)

Create interactive Plotly graphs of the reduced dimensions, with tooltips displaying metadata.

Parameters
  • data – numpy array

  • target – list

  • outpath – path to the folder for output files (.png, .html)

  • method – one of [pca, tsne, umap]

  • title_dataset – e.g. Pancreas ComBat corrected
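
For the `pca` option, the reduction step presumably projects the data onto a small number of principal components before plotting. The NumPy-only sketch below shows such a projection onto two components. The helper `pca_2d` is hypothetical, the two-dimensional output is an assumption, and the plotting and tooltip logic are not shown.

```python
import numpy as np


def pca_2d(data):
    """Project the rows of `data` onto their first two principal components."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


rng = np.random.default_rng(0)
samples = rng.normal(size=(20, 5))  # Synthetic stand-in for the data array.
embedding = pca_2d(samples)
print(embedding.shape)  # (20, 2)
```

The resulting two columns could then serve as the x and y coordinates of a scatter plot, one point per sample.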