Recently Updated Pages
Core Scientific Concepts (CoreSC)
Core Scientific Concepts (CoreSC) is an annotation scheme used to delineate different parts of sc...
Large Scale Multi-Label Learning
The Keras website has a tutorial on how to do multi-label learning with a large number of labels:...
SpaCy CoRef
SpaCy Coref is an experimental coreference resolution model in SpaCy. The project repository is he...
Federated Learning
Flower is a federated learning framework that is compatible with PyTorch, TensorFlow and others.
SpaCy GPU
Set Up Environment: It's relatively easy to use SpaCy with a GPU these days. First set up your con...
ML Best Practices
Machine learning is a complex and multifaceted activity that requires the combination of a number...
Explainability
Explainability is a big challenge in machine learning. I wrote a blog post about the ELI5 library...
Model Confidence Scores
Many ML classification models can provide a confidence score which tells the user how confident t...
Learning with Limited Data
Good machine learning depends heavily on good data. A few more good data points are likely to...
Relationship Extraction
Relationship Extraction (RE) is a task that is related to Coreference Resolution but with a focus...
Pattern-Exploiting Training
PET, or Pattern-Exploiting Training: @article{schick2020exploiting, title={Exploiting Cloze Qu...
Music
I have a very eclectic and general interest in music, from jazz to rap metal and from synthwave ...
Design Frameworks
Design frameworks provide out of the box styling and components for use in websites. Many framewo...
Mental Health Primer
May I have the serenity to accept the things I cannot change, the courage to change the things I...
Exporting Issues from Linear
The first step is to export our issues from Linear. The easiest way to do this is...
Deploying Django Apps
Packaging a Django App in Docker: I wrote a blog post about packaging Django apps up for shipping in d...
Django and PostgreSQL
When working with Django and PostgreSQL it is typically best to use the psycopg[binary] package: ...
DBT
DBT is a data transformation tool with a SaaS platform and an open-core command line tool. The to...
Data Wrangling
DuckDB: DuckDB is a lightweight OLAP-type database system written in C++ and designed to be used f...
Data Loading with Airbyte
Airbyte is a FOSS tool for mass data import and export when working with common flavours of SQL a...