Hi there!

This is my technical blog where I write about my work in data science and statistics.

Relevance-aware entity extraction for financial news

In this post, I detail how I built a context- and relevance-aware NER system for detecting entities in financial news.

The problem

The task is to build a model that extracts company names from financial news, but only when the companies are directly relevant to the article. For instance, consider a financial news article about weight-loss drug companies such as Novo Nordisk. Although the article also mentions big tech companies like Nvidia and Apple, they are not relevant to the story and ideally should not be extracted....

Automatic crossword generation using LLM Agents

In this post, I detail how I created an NLP-based automatic crossword puzzle generator. It takes any topic as user input and automatically generates answer-clue pairs relating to that topic, along with a crossword board. It is based on the AgentCoder approach.

Demo

My source code can be found here.

User inputs topic: ‘sports’

    python main.py sports

Crossword is generated, with board, answers, and clues:

    User Input Topic: SPORTS
    8 out of 14 words generated used

    F - G A M E - - - -
    O - - T - - - - - -
    O - - H O C K E Y -
    T - - L - O - - - -
    B A S E B A L L - -
    A - - T - C - - - -
    L - - I - H O S T -
    L - - C - - - - V -
    - - - - - - - - - -
    - - - - - - - - - -

    ACROSS:
    (1, 3) - A contest of risks (4)
    (3, 4) - Sport with sticks and pucks (6)
    (5, 1) - America's pastime, batting around (8)
    (7, 6) - One who greets at the door (4)

    DOWN:
    (1, 1) - Sport involving goals with kicks (8)
    (1, 4) - Fit for sports, sounds like a competition (8)
    (3, 6) - Mentor of teams (5)
    (7, 9) - Small screen box (2)

Motivation

While I was at Julius Baer, my team worked on replicating the results and implementing the framework in AgentCoder, which is the current state of the art on HumanEval and MBPP....
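To make the demo output easier to check by hand, here is a minimal sketch that reads an answer off the board given a clue's coordinates. It assumes the board is a 10x10 grid and that each `(row, col)` pair is 1-indexed and marks the first letter of the answer, which is what the demo output suggests. The names `BOARD` and `read_word` are hypothetical and are not taken from the project's source code.

```python
# The board from the demo, one row per line ("-" marks an empty cell).
BOARD = """\
F - G A M E - - - -
O - - T - - - - - -
O - - H O C K E Y -
T - - L - O - - - -
B A S E B A L L - -
A - - T - C - - - -
L - - I - H O S T -
L - - C - - - - V -
- - - - - - - - - -
- - - - - - - - - -
"""

# Parse into a list of rows, each a list of single-character cells.
grid = [line.split() for line in BOARD.splitlines()]


def read_word(grid, row, col, length, direction):
    """Read `length` cells starting at the 1-indexed (row, col).

    `direction` is "across" (left to right) or "down" (top to bottom).
    The 1-indexed convention is an assumption inferred from the demo.
    """
    r, c = row - 1, col - 1
    dr, dc = (0, 1) if direction == "across" else (1, 0)
    return "".join(grid[r + i * dr][c + i * dc] for i in range(length))


print(read_word(grid, 3, 4, 6, "across"))  # clue: Sport with sticks and pucks
print(read_word(grid, 1, 1, 8, "down"))    # clue: Sport involving goals with kicks
```

Reading the clue list this way also shows that crossing words share cells: the down answer at (3, 6) starts on a letter of the across answer at (3, 4).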

Reverse-mode autodiff from scratch

We implement a simple automatic differentiation tool in Python that can efficiently compute the gradient of any (simple) multivariable function.

Use case

Understanding how autodiff works is crucial for understanding backpropagation and how optimisation works in a deep learning setting. In general, we want an easy way to compute the gradients of a loss function with respect to its weight and bias parameters so that we can apply algorithms such as gradient descent....
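As a rough illustration of the idea, here is a minimal sketch of reverse-mode autodiff for scalar values. It is not the implementation from the post: the class name `Var`, the helper `sin`, and the choice to support only addition, multiplication, and sine are assumptions made to keep the example short.

```python
import math


class Var:
    """A scalar node in the computation graph (hypothetical name)."""

    def __init__(self, value, parents=()):
        self.value = value
        # Each entry is (parent node, local derivative of this node w.r.t. parent).
        self.parents = parents
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__

    def backward(self):
        """Accumulate d(self)/d(node) into node.grad for every upstream node."""
        # Build a topological order so each node is processed after all of
        # the nodes that depend on it.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                # Chain rule: pass the gradient back, scaled by the local derivative.
                parent.grad += local * node.grad


def sin(x):
    """Sine of a Var, recording cos(x) as the local derivative."""
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))


# f(x, y) = x * y + sin(x), so df/dx = y + cos(x) and df/dy = x.
x, y = Var(2.0), Var(3.0)
f = x * y + sin(x)
f.backward()
print(x.grad, y.grad)
```

A single call to `backward()` fills in the gradient for every input at once, which is why reverse mode suits loss functions with many parameters and one scalar output.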