Eugene Yanayt
PulseAugur coverage of Eugene Yanayt — every cluster mentioning Eugene Yanayt across labs, papers, and developer communities, ranked by signal.
-
Eugene Yan details migrating site comments to GitHub Issues via Utterances
Eugene Yan details a process for migrating website comments to Utterances, a system that uses GitHub issues to manage comments. The migration involved creating a dedicated repository, configuring Utterances, and using t…
-
Developer asks if ML is needed for 99% accurate PDF data extraction
A developer inquired about using machine learning to improve PDF data extraction, specifically for handling misspellings and typos in quote numbers that cause extraction failures. The author advised against using ML, su…
-
Eugene Yan details his unconventional path to data science leadership
Eugene Yan, a data science professional, shared insights into his career journey, starting from a psychology background and transitioning into data science roles at companies like IBM, Lazada, and Amazon. He highlighted…
-
Study compares BERT and T5 for NER; article touts paper reading for data scientists
A new arXiv paper details a study comparing BERT and T5 models for Named Entity Recognition (NER), analyzing their performance with different tag schemes and hyperparameters. The research aims to provide insights into c…
-
Senior Data Scientist shares advice on handling imposter syndrome
Eugene Yan, a Senior Data Scientist, responded to a reader named J who expressed imposter syndrome after receiving a promotion to a senior role. Yan advised J that the expectations for a senior position involve being a …
-
Expert beginners risk stagnation by mistaking narrow success for true expertise
Eugene Yan's article discusses the concept of the "expert beginner," an individual who achieves a degree of success in a narrow domain but fails to recognize the broader context and the need for continuous learning. Thi…
-
Unpopular Opinion: Data Scientists Should be More End-to-End
Eugene Yan argues that data scientists should adopt a more end-to-end approach to their work, encompassing problem framing, data engineering, model development, and deployment. He contends that specialization leads to c…
-
Eugene Yan builds web apps with FastHTML, Next.js, and SvelteKit
Eugene Yan details his experience building a web application using various modern frameworks, including FastHTML, Next.js, and SvelteKit. He compares their developer experiences by implementing the same data manipulatio…
-
What I Did Not Learn About Writing In School
Eugene Yan shares insights on improving non-fiction writing, emphasizing that effective writing involves extensive preparation rather than just the act of writing itself. He highlights that much of the work occurs befor…
-
Eugene Yan details FastAPI, Jinja, and HTML form integration for web apps
Eugene Yan has published a guide detailing how to create HTML applications using FastAPI, Jinja, and HTML forms. The article addresses a gap in existing documentation by explaining how to serve HTML content with FastAPI…
-
Data scientists must document projects for reproducibility and knowledge sharing
Data science projects often suffer from poor version control and reproducibility issues, particularly when using Jupyter notebooks with tools like Git. The inclusion of cell outputs in notebooks, while useful for sharin…
-
Eugene Yan automates GitHub profile README updates with Python and Actions
Eugene Yan details a method for automatically updating a GitHub profile README with recent blog posts. The process involves using Python's feedparser library to fetch entries from an Atom feed and then updating specific…
-
Eugene Yan explores the '85% Rule' for optimal performance and well-being
Eugene Yan's article discusses the "85% Rule," a concept popularized by Hugh Jackman and Tim Ferriss, which suggests that exerting 100% effort can sometimes lead to diminished returns compared to operating at 85% capaci…
-
Spark+AI Summit 2020: Notes cover feature engineering, data quality, and model efficiency
Eugene Yan's notes from the Spark+AI Summit 2020 cover both application-specific and application-agnostic talks in deep learning and data engineering. Application-specific sessions highlighted frameworks like Airbnb's Zipline for feat…
-
Eugene Yan addresses data science business integration and model development questions
Eugene Yan's latest post addresses common questions about the practical application of data science within business contexts. He clarifies that business requirements and desired outcomes are established early in project…
-
How to Set Up a Python Project For Automation and Collaboration
Eugene Yan's article outlines a robust Python project setup for enhanced automation and collaboration. The approach focuses on integrating automated checks like unit tests, type-checking, and linting, which can be trigg…
-
Eugene Yan explains Airflow's scheduling delay for ETL jobs
Eugene Yan's article clarifies a common point of confusion about Airflow job scheduling: an Airflow job runs only *after* its scheduled period has ended, that is, one schedule interval after the period begins. Unlike cron job…
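The timing described in this summary can be illustrated with plain date arithmetic. The sketch below is only an illustration under stated assumptions: it does not use Airflow itself, the function name `run_trigger_time` is hypothetical, and it assumes a fixed-length interval such as one day.

```python
from datetime import datetime, timedelta


def run_trigger_time(period_start: datetime, interval: timedelta) -> datetime:
    """Hypothetical helper: when a run covering the period that begins at
    `period_start` would be triggered, assuming the run starts only once
    the whole period has ended."""
    return period_start + interval


# Assumed example: a daily job whose period covers 2020-01-01.
period_start = datetime(2020, 1, 1)
daily = timedelta(days=1)

trigger = run_trigger_time(period_start, daily)
print(f"period covered: {period_start.date()}")
print(f"run triggered:  {trigger}")
```

Under these assumptions, the run that processes the data for 1 January starts at midnight on 2 January, which is the gap the summary refers to.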
-
Eugene Yan shares data science project success strategies: planning, execution, and communication
Eugene Yan outlines best practices for executing data science projects, emphasizing the importance of a clear plan and effective communication. He suggests starting with a literature review to build upon existing resear…
-
Eugene Yan finds value in Scrum for data science projects
Eugene Yan shares his evolving perspective on using Scrum methodologies within data science projects. Initially resistant to its structured approach, particularly regarding estimation and the potential for iterative rew…
-
Crocker's Law: Embrace feedback as a gift for growth, not offense
Eugene Yan's article explores Crocker's Law, a principle advocating for focusing on improving content rather than reacting emotionally to feedback. This concept, exemplified by Wikipedia editor Crocker and Shopify CEO T…