Search Results
156 total results found
Projects & Documentation
Digital Garden
ML and Data Science
Cover photo by Conny Schneider
Software Engineering
Cover image by AltumCode
Learning and Knowledge
Photo by нυвιѕ тανєяη on Unsplash
Microcosm
Microcosm is a tiny, lightweight micropub endpoint written in Python to support my website, Brainstorm.
🌱 Seed Propagator
AI and ML
Cover photo by Conny Schneider https://unsplash.com/@choys_
Software Engineering Misc
Cover image by AltumCode
Devices and Tech
Cover photo by https://unsplash.com/@we_are_rising
Science of Science
All things science of science, including scientometrics, measuring the real-world impact of scientific work, and text mining scientific papers. Photo by Dom Fou on Unsplash: https://unsplash.com/s/photos/lecture?utm_source=unsplash&utm_medium=referral&utm_content...
Node and Typescript
Airflow
Python
Engineering Leadership and CTO
Data Engineering and MLOps
Data Quality and Preparation
Working with LLMs
Benchmarks and Exercises
Google Cloud Platform
API Management
JIRA
PKM
Mental Health
Personal Knowledge Management
Plumbing
Odds and sods about plumbing maintenance
Free Open Source Software and Open Culture
FastAPI
LaTeX
LaTeX is a powerful typesetting system for producing books and other documents, with dedicated markup for mathematics
Workflows and Processes
Migrating from Linear to JIRA
Tasks
Tasks within ML and NLP
Machine Learning with Limited Data
Django
Working with the Django web framework and associated libraries
Explainability and Model Analysis
Intro to Microcosm
Warning: This page is very much a work in progress. Microcosm is a tiny and lightweight micropub server written in Python to support a static site powered by Hugo. It works by accepting micropub payloads and storing them in a Git repository that backs a Hugo sit...
Pattern Exploitative Training
PET or Pattern Exploitative Training @article{schick2020exploiting, title={Exploiting Cloze Questions for Few-Shot Text Classification and Natural Language Inference}, author={Timo Schick and Hinrich Schütze}, journal={Computing Research Repository}, ...
DVC
DVC or Data Version Control is an open source tool for managing data assets. It is very useful but also can be quite overwhelming to use. The main use cases I've found for DVC are: Keeping large data assets (e.g. machine learning datasets) version controlled...
Climate and Eco
Federated Learning
Flower is a federated learning framework compatible with PyTorch, TensorFlow and others
Knex
SQL query builder for Node.js with TypeScript support
Model Confidence Scores
Many ML classification models can provide a confidence score which tells the user how confident the model is that it has made the correct choice. The values of these confidence scores and what constitutes a "good" or "bad" score can vary a lot depending on the...
Question Answering
Approaches: Fine-Tuning Sentence-BERT for Question Answering. Capital One produced a tutorial (mirror) about using sentence-transformers for Question Answering. They use SBERT because it is optimised for fast compute on individual sentences and has good general ...
Security
The OWASP API Security Top 10 may be a good place to start when deciding what security measures to implement on your web project
Exploratory Data Analysis (EDA)
There are a number of powerful tools like Pandas Profiling and SweetViz that can make EDA fast and repeatable. Pandas Profiling Pandas Profiling is an automated EDA tool that generates rich HTML reports from pandas dataframes. It can be a very nice way to show...
Explainability
Explainability is a big challenge in machine learning. I wrote a blog post about the ELI5 library and how it can be applied to NLP models. Introducing ML customers to explainability early on can be a great way to build trust. A colleague suggests using Streaml...
Home Page
Welcome to the digital garden of James Ravenscroft. This site is where I keep my notes in progress; some of the material is not fully fleshed out, and some of the fleshed-out work may be a bit rough around the edges. More polished work can be found on my main b...
☕ Coffee
Good UK Coffee Merchants Hayling Island Coffee Society (HICS), Portsmouth A local (to me), independent coffee roaster on Hayling Island just north of Portsmouth, Hampshire Notable Blend: HICS Volcano Island - a strong dark roast similar to Taylors Hot Java ...
NPM and Gitlab
GitLab has a built-in package repository that can be used as a stand-in for NPM's global repo. Best practice is to map a scope to your repository in your .npmrc file and in your package's package.json file. GitLab uses CI tokens to authenticate against the npm...
Publishing Type Definitions
It can be useful to publish these types in custom NPM repos, e.g. GitLab. Configuring package.json File: Including src and dist folders: use the files key in package.json to indicate which directories should be published (https://stackoverflow.com/questions/6...
Music
I have a very eclectic and general interest in music, from jazz to rap metal and from synthwave to Cuban folk music. Personal Stats: Whilst I still scrobble to Last.FM, I also maintain my own maloja instance. For Coding and Concentrating: I'm a massive fan o...
AMQPLib and RabbitMQ
Channel closed by server: 406 (PRECONDITION-FAILED) with message "PRECONDITION_FAILED As explained by this article this implies that your channel has consumed a message without ACKing or NACKing it and it has timed out. Make sure to ACK or NACK all messages wh...
SpaCy GPU
Set Up Environment: It's relatively easy to use SpaCy with a GPU these days. First set up your conda environment and install cudatoolkit (use nvidia-smi to match versions of the toolkit with the drivers): Run nvidia-smi: Create conda env: conda create -n test p...
SpaCy CoRef
SpaCy CoRef is an experimental coreference resolution model in SpaCy. The project repository is here. There is currently a hard dependency on the LDC OntoNotes dataset, which makes it difficult to use without spending money. Hopefully they will release a pre-tra...
Digital Gardens, Zettelkasten and Second Brains
Chris Aldrich has some interesting thoughts on these topics here. Someone is maintaining a really cool collection of second brain stuff here.