News-Tweets NLP Linker

nlp
distributed-computing
python
data-engineering
Link tweets with news articles using NLP text similarity, with a distributed computing pipeline and an interactive dashboard
Published: July 16, 2021

Overview

This project links tweets to news articles by scoring the similarity of their text content with NLP techniques. The similarity-scoring pipeline runs in parallel on a distributed cluster (Mesos + Docker) to reduce runtime.

Interactive dashboard for exploring news-tweet similarity

Components:

  • Similarity pipeline — NLP-based text similarity scoring between tweets and news articles
  • Distributed computing — Mesos and Docker pipeline for parallel processing
  • Interactive dashboard — Dash app to view and interact with similarity results
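
To make the similarity-pipeline component concrete, here is a minimal sketch of one common way to score tweet-article similarity: TF-IDF vectors compared with cosine similarity. The post does not say which NLP technique the project uses, so this baseline is an assumption, and the function names (`tokenize`, `tfidf_vectors`, `cosine`, `link_tweets_to_articles`) are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: how many documents contain each term.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed IDF keeps every weight positive, even for ubiquitous terms.
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def link_tweets_to_articles(tweets, articles):
    """For each tweet, return (tweet_index, best_article_index, score)."""
    if not articles:
        return []
    # Fit TF-IDF on tweets and articles together so they share one vocabulary.
    vectors = tfidf_vectors(list(tweets) + list(articles))
    tweet_vecs = vectors[:len(tweets)]
    article_vecs = vectors[len(tweets):]
    links = []
    for i, tweet_vec in enumerate(tweet_vecs):
        scores = [cosine(tweet_vec, article_vec) for article_vec in article_vecs]
        best = max(range(len(scores)), key=scores.__getitem__)
        links.append((i, best, scores[best]))
    return links
```

Calling `link_tweets_to_articles(tweets, articles)` with two lists of strings returns one tuple per tweet. A real pipeline would likely add stop-word removal, a score threshold below which a tweet is left unlinked, or a different text representation altogether.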

Distributed computing architecture using Mesos and Docker
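
The parallel structure behind this kind of pipeline can be sketched as splitting the tweets into chunks, scoring each chunk independently, and merging the results. The sketch below uses a local thread pool purely as a stand-in for the cluster; it does not use any Mesos or Docker API, and the names `chunked`, `score_chunk`, `score_in_parallel`, and the toy `jaccard` score are hypothetical, not taken from the project.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, chunk_size):
    """Split a list into consecutive chunks of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def score_chunk(tweet_chunk, articles, score_fn):
    """Score every (tweet, article) pair in one chunk; one unit of work."""
    return [
        (tweet, article, score_fn(tweet, article))
        for tweet in tweet_chunk
        for article in articles
    ]


def score_in_parallel(tweets, articles, score_fn, chunk_size=2, max_workers=4):
    """Fan chunks out to workers, then merge the rows in input order."""
    chunks = chunked(tweets, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the submitted chunks.
        results = pool.map(
            lambda chunk: score_chunk(chunk, articles, score_fn), chunks
        )
        return [row for chunk_rows in results for row in chunk_rows]


def jaccard(a, b):
    """Toy placeholder score: word-set overlap between two texts."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0
```

Because each chunk depends only on its own tweets and the shared article list, the chunks can run anywhere. On a cluster, each call to `score_chunk` would presumably become a containerized task instead of a thread, which is where the speedup for CPU-bound scoring would come from.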

Code

GitHub Repository