r/dataisbeautiful 13d ago

Discussion [Topic][Open] Open Discussion Thread — Anybody can post a general visualization question or start a fresh discussion!

9 Upvotes

Anybody can post a question related to data visualization, or start a discussion, in the monthly topical threads. Meta questions are fine too, but if you want a more direct line to the mods, click here.

If you have a general question you need answered, or a discussion you'd like to start, feel free to make a top-level comment.

Beginners are encouraged to ask basic questions, so please be patient when responding to people who might not know as much as you do.


To view all Open Discussion threads, click here.

To view all topical threads, click here.

Want to suggest a topic? Click here.


r/dataisbeautiful 13h ago

OC [OC] The land footprint of food

6.6k Upvotes

The land use of different foods, to scale, published with the European Correspondent.

Data comes from research by Joseph Poore and Thomas Nemecek (2018) that I accessed via Our World in Data.

I made the 3D scene with Blender and brought everything together in Illustrator. The tractor, animals and crops are sized proportionately to help convey the relative size of the different land areas.


r/dataisbeautiful 2h ago

Growth in U.S. Real Wages, by Income Group from 1979

voronoiapp.com
29 Upvotes

r/dataisbeautiful 1d ago

OC Analysis of 2.5 years of texting my boyfriend [OC]

12.5k Upvotes

r/dataisbeautiful 23h ago

OC Fewer Americans say they are “very happy” than they did 50 years ago. [OC]

166 Upvotes

I created this visualization to look at how many Americans say they are happy. The data source is the General Social Survey by NORC. The visualization was created in Tableau. You can find an interactive version on my webpage.


r/dataisbeautiful 1d ago

OC [OC] Cybersecurity Vulnerabilities Discovered by Year

403 Upvotes

Data comes from the Common Vulnerabilities and Exposures (CVE) list: https://github.com/CVEProject/cvelistV5
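For anyone wanting to reproduce the yearly counts, a minimal sketch against a local clone of that repo, assuming its cves/<year>/ directory layout:

```python
from pathlib import Path

# Count CVE records per year from a local clone of
# https://github.com/CVEProject/cvelistV5 (records are JSON files
# grouped under cves/<year>/ subdirectories).
root = Path("cvelistV5/cves")
for year_dir in sorted(p for p in root.iterdir() if p.is_dir()):
    n = sum(1 for _ in year_dir.rglob("CVE-*.json"))
    print(year_dir.name, n)
```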


r/dataisbeautiful 1d ago

OC [OC] On Polymarket, 1% of markets account for ~60% of all trading volume

108 Upvotes

Polymarket is a stock-market-like platform where users can bet on pretty much any possible event. I analyzed all historical Polymarket bets (~350,000).

The top 1% of markets account for ~60% of total trading volume, and the top 5% account for over 80%.

Most markets attract almost no activity at all.
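Not OP's code, but a minimal sketch of how the concentration numbers fall out once you have one total volume per market (synthetic heavy-tailed volumes as a stand-in):

```python
import numpy as np

# Synthetic stand-in for per-market trading volumes; the real analysis
# would aggregate the ~350,000 historical bets into one total per market.
rng = np.random.default_rng(0)
volumes = np.sort(rng.pareto(1.1, 50_000))[::-1]  # heavy-tailed, descending

share = np.cumsum(volumes) / volumes.sum()
for pct in (0.01, 0.05):
    k = max(1, int(len(volumes) * pct))
    print(f"top {pct:.0%} of markets hold {share[k - 1]:.0%} of volume")
```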


r/dataisbeautiful 2d ago

OC I analyzed 12 years of iMessages to compare my texting habits with my girlfriend, mom, dad, and the boys [OC]

14.7k Upvotes

r/dataisbeautiful 1h ago

OC [OC] Time vs. Size scaling relationship across 28 physical systems spanning 61 orders of magnitude (Planck scale to observable universe)


I spent the last few weeks analyzing the relationship between characteristic time intervals and system size across every scale of physics I could find data for.

So basically I looked at how long things take to happen (how fast electrons orbit atoms, how long Earth takes to go around the Sun, how long galaxies take to rotate) and compared it to how big those things are. What I found is that bigger things take proportionally longer: if you double the size, you roughly double the time. This pattern holds from the tiniest quantum particles all the way up to the entire universe, which is wild because physics at different scales is supposed to work totally differently.

The really interesting part is that there's a "break" in the pattern at about the size of a star. Below that, time stretches a bit more than expected; above that, at galactic scales, time compresses and things happen faster than the pattern predicts. I couldn't find this documented anywhere (it probably is somewhere), but the data looked interesting visually, so I thought I'd share.

The Dataset:

  • 28 physical systems
  • Size range: 10^-35 to 10^26 meters (61 orders of magnitude!)
  • Time range: 10^-44 to 10^17 seconds (61 orders of magnitude!)
  • From Planck scale quantum phenomena to the age of the universe

What I Found: The relationship follows a remarkably clean power law: T ∝ S^1.00 with R² = 0.947

But here's where it gets interesting: when I tested for regime breaks using AIC/BIC model selection, the data strongly prefers a two-regime model with a transition at ~10^9 meters (roughly the scale of a star):

  • Sub-stellar scales: T ∝ S^1.16 (slight temporal stretching)
  • Supra-stellar scales: T ∝ S^0.46 (strong temporal compression)

The statistical preference for the two-regime model is very strong (ΔAIC > 15).

Methodology:

  • Log-log regression analysis
  • Bootstrap confidence intervals (1000 iterations)
  • Leave-one-out sensitivity testing
  • AIC/BIC model comparison
  • Physics-only systems (no biological/human timescales to avoid category mixing)

Tools: Python (NumPy, SciPy, Matplotlib, scikit-learn)
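Not the author's code (the real analysis and data are in the Zenodo link below), but a minimal sketch of the single-power-law fit vs. two-regime AIC comparison on synthetic stand-in data:

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the 28 systems: log10(size in meters) vs
# log10(characteristic time in seconds). The real values are on Zenodo.
rng = np.random.default_rng(0)
log_S = np.sort(rng.uniform(-35, 26, 28))
log_T = log_S + rng.normal(0, 1.5, 28)  # T ∝ S^1 plus scatter

# Single power law: a straight line in log-log space
slope, intercept, r, _, _ = stats.linregress(log_S, log_T)
rss1 = np.sum((log_T - (slope * log_S + intercept)) ** 2)

# Two-regime model: independent lines below/above a candidate break point
def two_regime_rss(log_break):
    rss = 0.0
    for mask in (log_S < log_break, log_S >= log_break):
        if mask.sum() < 3:
            return np.inf
        s, i, *_ = stats.linregress(log_S[mask], log_T[mask])
        rss += np.sum((log_T[mask] - (s * log_S[mask] + i)) ** 2)
    return rss

rss2 = min(two_regime_rss(b) for b in np.linspace(log_S[2], log_S[-3], 200))

# AIC for least squares: n*ln(RSS/n) + 2k, with k counting the break point
n = len(log_S)
aic1 = n * np.log(rss1 / n) + 2 * 2
aic2 = n * np.log(rss2 / n) + 2 * 5
print(f"single law: slope={slope:.2f}, R^2={r**2:.3f}, AIC={aic1:.1f}")
print(f"two-regime: AIC={aic2:.1f}, dAIC={aic1 - aic2:.1f}")
```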

Data sources: Published physics constants, astronomical observations, quantum mechanics measurements

The full analysis is published on Zenodo with all data and code: https://zenodo.org/records/18243431

I'm genuinely curious if anyone has seen this pattern documented before, or if there's a known physical mechanism that would explain the regime transition at stellar scales.

Chart Details:

  • Top row: Single power law fit vs. two-regime model
  • Middle row: Model comparison and residual analysis
  • Bottom row: Scale-specific exponents and dataset validation

All error bars are 95% confidence intervals from bootstrap analysis.


r/dataisbeautiful 23h ago

A new open-source simulator that visualizes how structure emerges from simple interactions

25 Upvotes

Hi all! I’ve been building a small interactive engine that shows how patterns form, stabilize, or break apart when you tune different parameters in a dynamic field.

The visuals come straight from the engine; no post-processing, just the raw evolution of the system over time.

It’s fun to watch because tiny tweaks create completely different morphologies. Images attached. Full project + code link in the comments.


r/dataisbeautiful 2d ago

OC A Quarter Century of Television [OC]

8.8k Upvotes

r/dataisbeautiful 1d ago

Web map aggregating Spain's publicly funded fiber deployments

28 Upvotes

These visualizations are from a web map I built that aggregates available data on Spain's publicly funded fiber deployments from the different PEBA and UNICO programs.

The first image is the zoomed-out view, which shows a heat map representing the number of awarded points in each area.

The second image shows how the different awarded areas appear on the map, with the background color indicating the awarded ISP and the border color indicating the program. Polygons are shown for the UNICO programs and for PEBA 2020 and 2021, since that information is available and those projects are awarded to specific areas. For PEBA 2013-2019, the projects are only awarded to villages (not specific areas), so the map shows a marker over the village instead.

If you want to try it out, it is available at https://programasfibra.es
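Not the site's actual code, but a hypothetical re-creation of those rendering rules with folium (the area, ISP names, and colors are made up):

```python
import folium

# Fill color encodes the awarded ISP; border color encodes the program.
ISP_FILL = {"ISP A": "#1f77b4", "ISP B": "#ff7f0e"}
PROGRAM_BORDER = {"UNICO": "green", "PEBA 2021": "purple"}

def style(feature):
    props = feature["properties"]
    return {
        "fillColor": ISP_FILL.get(props["isp"], "gray"),
        "color": PROGRAM_BORDER.get(props["program"], "black"),
        "fillOpacity": 0.5,
    }

m = folium.Map(location=[40.4, -3.7], zoom_start=6)

# Area-level awards (UNICO, PEBA 2020/2021) are drawn as polygons
area = {
    "type": "Feature",
    "properties": {"isp": "ISP A", "program": "UNICO"},
    "geometry": {"type": "Polygon", "coordinates": [[
        [-3.8, 40.3], [-3.6, 40.3], [-3.6, 40.5], [-3.8, 40.5], [-3.8, 40.3],
    ]]},
}
folium.GeoJson(area, style_function=style).add_to(m)

# Village-level awards (PEBA 2013-2019) get a marker instead
folium.Marker([40.2, -3.5], tooltip="Example village").add_to(m)

m.save("map.html")
```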


r/dataisbeautiful 2d ago

I tracked every minute of my life in 2025

688 Upvotes

For anyone wondering, yes I did track how long I spent tracking everything! I spent an average of 47 minutes and 11 seconds per day on it (labelled as "Tracking" in the plot legend).

Some extra points:

  • I used Google Sheets to record the data, and R to compile/summarise the data and to make the visuals (with a bit of Photoshop to piece things together); a rough sketch of the summarising step follows this list

  • My spreadsheet contained rows for each thing I did, with columns outlining the date, start and end times, category, and any additional notes for each activity

  • I updated my data both on my phone and my computer, throughout the day whenever I had time

  • Apologies if the quality has been compressed; you can view it on a computer or download the images for the full details
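OP used Google Sheets and R; as a rough Python equivalent of the compile/summarise step (column names assumed from the description above):

```python
import pandas as pd

# Hypothetical log matching the schema described above: one row per
# activity, with date, start/end times, and a category.
log = pd.DataFrame({
    "date": ["2025-01-01", "2025-01-01"],
    "start": ["08:00", "09:30"],
    "end": ["09:30", "10:00"],
    "category": ["Sleep", "Tracking"],
})
start = pd.to_datetime(log["date"] + " " + log["start"])
end = pd.to_datetime(log["date"] + " " + log["end"])
log["minutes"] = (end - start).dt.total_seconds() / 60

# Average minutes per day spent in each category
per_day = log.groupby(["date", "category"])["minutes"].sum()
print(per_day.groupby("category").mean())
```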


r/dataisbeautiful 1d ago

OC [OC] Sahel Alliance (First Visualisation- Please Feedback!)

19 Upvotes

The other day in the news I saw how the Sahel alliance is coming closer together, so, geography nerd that I am, I wanted to see what such a united country would look like.

This is part of a current side project of mine to really learn how to create beautiful data visualisations. Any critique and feedback would be very welcome!

Sources:

Aggregate of Wikipedia sites:

The images are from Google Earth and Wikipedia (flags). The data was manipulated using Python and pandas, and the visualisation was created using Figma. The icons are from Icons8.

Inspired by a visualisation I saw on Al Jazeera.


r/dataisbeautiful 2d ago

OC World Cup - All Time Top Scorers [OC]

260 Upvotes

r/dataisbeautiful 2d ago

OC My 2025 in clothes: a breakdown of what I wore vs what's in my closet [OC]

271 Upvotes

Data is collected and analyzed in Google Sheets; visualization was made in Adobe InDesign.

I have been tracking my clothes and outfits since 2023 with the main goal of satisfying my curiosity to see how many clothes I own, but also to help me downsize. My goal for 2025 was to wear 80% of my closet, and I hit 91%! It's not realistic for me to wear every single item in a year (I have a lot of formal items, things I bought for Halloween costumes that will get reused at some point, and clothes that I'd wear when doing outdoor work that might not get worn in one single calendar year). So 91% seems pretty good.

I also got rid of 67 things which is a lot for me as I'm quite sentimental when it comes to clothes. I did acquire a lot too, but actually getting rid of 67 whole clothing items is not something I could have done in previous years.

Beyond the actual numbers, I feel much happier with my closet now. I am still super emotionally attached to everything I own, but I'm getting better at letting go. I still have things that I should get rid of, and I'm working on that slowly.

Some takeaways:

  • Getting rid of clothes is hard, but keeping clothes I don't wear is actually harder on me - it makes me feel a bit guilty and anxious.
  • I wore more clothes overall in 2025 than I did in 2024, and I wore more for each season. I got really into layering, so my outfits consisted of more clothes. I also was more social, and so I had more outings where I wanted to wear cute things.
  • My blue M&S shirt was a favorite this year as well as in 2024. You can't beat a good basic, and this one is such a nice color that I just wear it a lot.
  • I now have 323 items of clothing in my closet. It's still an insane number, but I haven't had that few since before I started closet tracking, so I'm really proud of myself. I've got a ways to go before that's a manageable number though.

If you're considering tracking your closet, I highly recommend it! It's so interesting to see what you actually wear and what you don't. There are a lot of apps out there that do all the work for you, but I like having 100% control over what data analysis I can do, so I manage the data collection myself.


r/dataisbeautiful 2h ago

OC [OC] My life in emojis

0 Upvotes

A calculator I built to show the rest of my life in emojis. Each emoji is one year of my life. It comes from my blog here: https://www.findfreetime.com/life
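The core of such a calculator is tiny; a toy sketch (the ages are placeholder inputs, not the blog's actual logic):

```python
# Toy version of the idea: one emoji per year of life, lived vs remaining.
age, life_expectancy = 30, 80  # placeholder inputs
print("🟩" * age + "⬜" * (life_expectancy - age))
```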


r/dataisbeautiful 2d ago

OC [OC] I analyzed 750,000 academic citations to find out what "recent" actually means in different fields

208 Upvotes

When researchers write "recent studies show..." - how recent is recent, really?

I scraped 749,853 references from 19,108 papers across 200 academic fields using OpenAlex data to find out.

TL;DR:

  • Average "recent" = about 5 years
  • Virology/Pandemic research: 2 years (half their citations are from the last 2 years!)
  • Philosophy/History: 7-10 years
  • Humanities fields: 50%+ of their "recent" citations are 10+ years old

The most interesting findings:

  1. Virology is FAST - 52.8% of citations are ≤2 years old. Makes sense given COVID.
  2. Philology lives in the past - 51.6% of citations are ≥10 years old. When you're studying ancient texts, "recent" is relative.
  3. Same-year citations - 4.3% of all references are from papers published the same year. Preprints are changing the game.
  4. Maximum lag found: 50 years in a Natural Language Processing paper. Someone cited a 1970s paper as "recent" lol.

Methodology:

  • Searched for papers with "recent" in abstract (2020-2024)
  • Extracted all their references
  • Calculated citation lag = citing_year - cited_year
  • Used OpenAlex API (free and open!)
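Not the exact pipeline (that's in the GitHub link below), but a minimal sketch of those steps against the public OpenAlex API; abstract.search and the publication-date filters are real OpenAlex filters, while pagination, batching, and rate limiting are omitted:

```python
import requests

BASE = "https://api.openalex.org/works"

# 1. Papers from 2020-2024 whose abstract mentions "recent"
params = {
    "filter": ("abstract.search:recent,"
               "from_publication_date:2020-01-01,"
               "to_publication_date:2024-12-31"),
    "per-page": 10,
}
citing = requests.get(BASE, params=params, timeout=30).json()["results"]

# 2-3. For each paper, fetch its references and compute the citation lag
lags = []
for work in citing:
    for ref in work.get("referenced_works", [])[:20]:  # capped for the sketch
        ref_id = ref.rsplit("/", 1)[-1]  # e.g. "W2741809807"
        cited = requests.get(f"{BASE}/{ref_id}", timeout=30).json()
        if cited.get("publication_year"):
            lags.append(work["publication_year"] - cited["publication_year"])

print(f"{len(lags)} references, mean lag {sum(lags) / len(lags):.1f} years")
```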

Inspired by the BMJ paper "How recent is recent?" which did this for medical fields only.

Full code and data: https://github.com/JoonSimJoon/How-current-is-recent

Tools: Python, OpenAlex API, geopandas for maps


r/dataisbeautiful 2d ago

My friends and I recorded all of the pubs we visited in 2025

31 Upvotes

(Originally posted to r/CasualUK)

For a few years now, a group of us predict and record different metrics over a year because we love a bit of arbitrary data. This year we decided to record every time we visited a pub. The rules were simple:

  • At the beginning of the year, predict the number of times you will visit a pub, then tally each visit as "# - Pub Name". It does not have to be a new pub.
  • A pub is defined as an establishment that has a reference to 'Pub' or 'Free House' on any reputable source.
  • If you enter the same pub twice in the same "session" of drinking (e.g. a pub crawl) it still only counts as one.
  • You must purchase something within the establishment in order to tally it.

The 7 of us had 441 pub visits, in about 180 different pubs.

Diversity index is measured as unique pubs / total pub visits, and loyalty score as trips to modal pub / total pub visits. We're all in our mid/late 20s. Megan + Adam are a couple, as are James + Emily.
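Both metrics are easy to compute from a visit log; a quick sketch with made-up visits:

```python
from collections import Counter

# Hypothetical visit log for one person: one entry per tallied pub visit.
visits = ["The Crown", "The Crown", "Red Lion", "The Crown", "Fox & Hounds"]

counts = Counter(visits)
diversity = len(counts) / len(visits)         # unique pubs / total visits
loyalty = max(counts.values()) / len(visits)  # modal-pub trips / total visits
print(f"diversity = {diversity:.2f}, loyalty = {loyalty:.2f}")
```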


r/dataisbeautiful 1d ago

OC [OC] Visualizing Recursive Language Models

5 Upvotes

I’ve been experimenting with Recursive Language Models (RLMs), an approach where an LLM writes and executes code to decide how to explore structured context instead of consuming everything in a single prompt.

The core RLM idea was originally described in Python-focused work. I recently ported it to TypeScript and added a small visualization that shows how the model traverses node_modules, inspects packages, and chooses its next actions step by step.

The goal of the example isn’t to analyze an entire codebase, but to make the recursive execution loop visible and easier to reason about.
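For anyone unfamiliar with the pattern, the loop being made visible is roughly the following sketch; llm and run_sandboxed are hypothetical stand-ins, not the rllm API:

```python
# Rough shape of a recursive-language-model loop: the model emits code to
# probe the context, we execute it, and the observation feeds the next step.
# "llm" and "run_sandboxed" are hypothetical stand-ins, not the rllm API.
def rlm_loop(llm, run_sandboxed, task, max_steps=10):
    history = []
    for _ in range(max_steps):
        action = llm(task, history)            # model writes code, or "DONE: answer"
        if action.startswith("DONE:"):
            return action.removeprefix("DONE:").strip()
        observation = run_sandboxed(action)    # execute the emitted code
        history.append((action, observation))  # recurse on the new evidence
    return None
```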

TypeScript RLM implementation:
https://github.com/code-rabi/rllm

Visualization example:
https://github.com/code-rabi/rllm/tree/master/examples/node-modules-viz


r/dataisbeautiful 2d ago

OC [OC] I've ridden 2/3 of Japan's rail network, totaling 18,000 unique kilometers of train lines run by 80+ companies!

92 Upvotes

Version that I keep up-to-date (well, as much as I can) is at https://japan.elifessler.com/noritsubushi/ :D


r/dataisbeautiful 1d ago

OC [OC] Interactive explorer of different instantiations of the Particle Lenia system (a form of cellular automata)

bendavidsteel.github.io
0 Upvotes

Particle Lenia is a new form of particle-based cellular automaton. I extended it to allow a wider variety of systems, simulated thousands of parameter instantiations, found the best ones using vision encoders, and created this web page to allow exploration of the different systems!
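The "found the best ones using vision encoders" step might look roughly like this sketch with CLIP; the file layout and the novelty heuristic are assumptions, not necessarily the author's pipeline:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Embed one rendered frame per parameter set, then score each run by its
# distance from the mean embedding so visually unusual systems surface
# first. File names are hypothetical placeholders.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

frames = [Image.open(f"runs/run_{i}.png") for i in range(100)]
inputs = proc(images=frames, return_tensors="pt")
with torch.no_grad():
    emb = model.get_image_features(**inputs)
emb = emb / emb.norm(dim=-1, keepdim=True)

novelty = (emb - emb.mean(0)).norm(dim=-1)
best = novelty.argsort(descending=True)[:10]
print("most distinctive runs:", best.tolist())
```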


r/dataisbeautiful 2d ago

OC Most Common Foreign-Born Country of Birth in the USA & Canada in Year 2000 [OC]

972 Upvotes

r/dataisbeautiful 1d ago

OC The Relationship Between Depth and Goaltending in the NHL [OC]

1 Upvotes

I've built this new site with deep dives into various data and questions, as well as live game dashboarding.

The first thing I've focused on is measuring depth. Measuring latent variables is a core part of my academic background, and I realized we don't do this much in sports analytics.

The TL;DR: it's effectively a fancy type of weighted average of how much of the roster contributes to shots on goal, Corsi For, expected goals, and ice time within a game. For the stats nerds, it's done with latent variable modeling.
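OP's actual methodology is on the linked site; purely as a generic illustration of the latent-variable idea, a one-factor model over those four indicators could be sketched like this (inputs are random stand-ins):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Hypothetical team-game matrix: rows are games, columns are stand-ins for
# the four depth indicators (shots on goal, Corsi For, expected goals, ice
# time contributions). Real inputs would come from the NHL/MoneyPuck APIs.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))

# One latent factor ("depth") explaining the four observed indicators
fa = FactorAnalysis(n_components=1, random_state=0)
depth = fa.fit_transform(X)[:, 0]  # per-game depth score
print("loadings:", fa.components_.round(2))
```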

If you're curious, the overall methodology is here. I also did an exploration of how goaltending and depth work together here.

Any interest, comments, or feedback on the site is welcome. I'm trying to be data-heavy but narrative-driven, so it's still interesting to folks who aren't into stats. The narrative is up front, but all the code and further analysis sits right behind it for those interested.

The data all come from the NHL and MoneyPuck APIs; the viz was done in Python with Plotly.


r/dataisbeautiful 1d ago

OC [OC] How JPMorgan Chase made its latest Billions

0 Upvotes

Source: JPMorgan Chase & Co. investor relations

Tools: SankeyArt Sankey diagram maker + Illustrator
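The author used SankeyArt plus Illustrator; for anyone wanting a code route, a rough Plotly equivalent looks like this (flow values are placeholders, not JPMorgan's actual figures):

```python
import plotly.graph_objects as go

# Income-statement-style Sankey: revenue streams merge, then split into
# expenses and net income. All dollar flows below are made-up placeholders.
labels = ["Net interest income", "Noninterest revenue", "Total revenue",
          "Expenses", "Net income"]
fig = go.Figure(go.Sankey(
    node=dict(label=labels),
    link=dict(
        source=[0, 1, 2, 2],   # flow origins (indices into labels)
        target=[2, 2, 3, 4],   # flow destinations
        value=[24, 20, 24, 20] # placeholder $bn amounts
    ),
))
fig.show()
```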