Annie Lyu

Senior Data Scientist



I am a Senior Data Scientist working on data mining for business intelligence at Autodesk. My PhD advisors were Dr. Heike Hofmann and Dr. Emily Berg. Currently, I am co-leading R-Ladies Ames with Anabelle Laurent. For pleasure, I enjoy hiking with my dog Peony, karaoke, and cross-stitch.


  • Data Visualization
  • Statistical Learning


  • PhD in Statistics, 2020

    Iowa State University

  • BSc in Statistics, 2015

    Wuhan University



Senior Data Scientist, Autodesk Construction Solutions


Aug 2020 – Present Greater Toronto Area, Canada

Graduate Research Assistant, Center for Survey Statistics and Methodology

Iowa State University

Aug 2015 – Jun 2020 Ames, Iowa

Recent Posts

VISCOVER is now featured in the RStudio Shiny Gallery!

I am so happy to share the news that my R Shiny app viscover is now featured in the RStudio Shiny Gallery. 🎉 🎉 🎉 It is categorized under the Public Sector section of the Gallery because it applies to the soil survey data and cropland data layer maintained by the USDA. viscover takes its name from VIsualizing Soil and Crop data and their OVERlay. Although my motivation to develop …

Fun blogdown in R to design a personal website

Inspired by David Robinson’s keynote talk at the RStudio conference 2019 (summarized in the tweet below), I decided to write a post about how I use Yihui’s fantastic R package blogdown to develop my own personal website.

“When you’ve written the same code 3 times, write a function. When you’ve given the same in-person advice 3 times, write a blog post.” David Robinson (@drob), November 9, 2017

Well, there are a lot of useful references to check out.

Fun Leaflet in R with NYC Squirrel Census Data

Anabelle introduced the NYC Squirrel Census data to me today. It is also one of the datasets recommended by #TidyTuesday. Both of us adore squirrels 😍. The dataset contains variables that tell when (Date) and where (longitude X and latitude Y) people saw a squirrel of a certain age (Age) and fur color (Primary Fur Color) doing certain activities (Running, Climbing, Chasing, Eating, Foraging or Other Activities). Having some experience working with leaflet in R, I find it very interesting to visualize this dataset on an interactive map.


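The first Chinese-language blog post on my personal website is dedicated to the memorable summer of 2019, answering Boss Xie's call: I web, therefore I am. A seven-day solo trip to Europe. Because I was attending an international conference, I briefly visited …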

Showcase my home-made dessert

🍰 🍰 🍰 dessert


viscover: visualize soil and crop data and their overlay

Interact with USDA-NASS Cropland Data Layer and USDA-NRCS Soil Survey Geographical Data. Featured in the RStudio Shiny Gallery.

iNtr: an interactive NRI table review tool

Sanity check of the data products of the National Resources Inventory program. Developed for the Center for Survey Statistics and Methodology at ISU.

ISOFAST: ISA On-Farm Trial Summarization Tool

Enables farmers to easily navigate exploratory data analysis and statistical inference from on-farm trial data. Developed for the Iowa Soybean Association.

Iowa DNR MSIM - SGCN Modeling

Interact with the predictive occupancy map of endangered wild species. Developed for the Iowa Department of Natural Resources.

Systematic Sampling Illustration

Developed for the STAT 421 (Survey Statistics) class at ISU.

Recent & Upcoming Talks

Create a personal website with Blogdown like what we did

A personal website is an incomparable platform for building your online profile and showcasing your amazing work (research or other …

Empirical Bayes small area prediction under a zero-inflated lognormal model with correlated random effects

Many variables of interest in agricultural or economic surveys have skewed distributions and can be contaminated with a …

Applications of R Shiny to Explore, Evaluate and Improve Total Survey Quality

Maintaining and assessing total survey quality for a large-scale, complex survey such as the National Resources Inventory (NRI) often …

Progress Report: Visualization of Sheet and Rill Erosion on US Cropland

The National Resources Inventory (NRI) is a longitudinal survey that monitors natural resources on non-federal US land. It provides annual …

Empirical Bayes Small Area Prediction of Sheet and Rill Erosion Using a Zero-Inflated Lognormal Model

In the Conservation Effects Assessment Project (CEAP), some of the variables are right-skewed and contain zeros. We proposed an empirical …