this post was submitted on 01 Nov 2024
516 points (97.4% liked)

Data is Beautiful

4828 readers
21 users here now

A place to share and discuss visual representations of data: Graphs, charts, maps, etc.

DataIsBeautiful is for visualizations that effectively convey information. Aesthetics are an important part of information visualization, but pretty pictures are not the sole aim of this subreddit.

  A post must be (or contain) a qualifying data visualization.

  Directly link to the original source article of the visualization
    Original source article doesn't mean the original source image. Link to the full page of the source article as a link-type submission.
    If you made the visualization yourself, tag it as [OC]

  [OC] posts must state the data source(s) and tool(s) used in the first top-level comment on their submission.

  DO NOT claim "[OC]" for diagrams that are not yours.

  All diagrams must have at least one computer generated element.

  No reposts of popular posts within 1 month.

  Post titles must describe the data plainly without using sensationalized headlines. Clickbait posts will be removed.

  Posts involving American Politics, or contentious topics in American media, are permissible only on Thursdays (ET).

  Posts involving Personal Data are permissible only on Mondays (ET).

Please read through our FAQ if you are new to posting on DataIsBeautiful.

Commenting Rules

Don't be intentionally rude, ever.

Comments should be constructive and related to the visual presented. Special attention is given to root-level comments.

Short comments and low effort replies are automatically removed.

Hate Speech and dogwhistling are not tolerated and will result in an immediate ban.

Personal attacks and rabble-rousing will be removed.

Moderators reserve discretion when issuing bans for inappropriate comments. Bans are also subject to you forfeiting all of your comments in this community.

Originally r/DataisBeautiful

founded 1 year ago
(page 2) 50 comments
[–] [email protected] 12 points 4 days ago* (last edited 4 days ago) (1 children)

I would imagine this is because there is a 'comfortable' rate of information exchange in human conversation, and so each given language will be spoken at a pace that achieves this comfortable rate.

So it's not that the syllable rate coincidentally results in the same information rate, but the opposite - the syllable rate adjusts to match the desired information rate.

[–] [email protected] 6 points 4 days ago (1 children)

Interesting thought.

I'd add it's probably also that 90%+ of conversation isn't about "data transfer" in the technical sense, but relationship building. So information volume isn't usually crucial.

Now let's see this work done in technical fields, especially change management, maintenance, emergency services, etc, where time is crucial. Those environments tend to have very "coded" language, so we don't have to say a paragraph whenever we call for a very specific function/tool/action.

I suspect the languages would still have similar curves, but the data rates would increase.

[–] [email protected] 9 points 4 days ago* (last edited 3 days ago)

I’d like a visual of how much unnecessary elaboration different languages commonly use to make a point.
Though you can elaborate excessively for fun, how much is common?
And on the other end of the scale text speak is often extremely concise (not me tho ha). Would be cool to see and compare the limits.

[–] [email protected] 2 points 3 days ago (2 children)

I am curious about Arabic. I feel like it should have the highest information rate.

[–] [email protected] 8 points 4 days ago* (last edited 4 days ago) (1 children)

Turkish seems inefficient. You spend the effort to talk quickly but don't get the reward of high info transfer speed like Spanish.

[–] [email protected] 6 points 4 days ago

The words are very modular and systematic, but you seemingly pay a price for it.

[–] [email protected] 6 points 4 days ago

Are you trying to claim that the information transmission speed (Informationsübermittlungsgeschwindigkeit) of the German language is below average? What impertinence!

[–] Apytele 5 points 4 days ago* (last edited 4 days ago) (2 children)

English is pictured as such a smooth, almost perfectly normalized bell curve. On one hand it's such a versatile language that (largely due to colonialism) has undergone so much evolution and mixing with other languages that I can believe that. On the other hand it looks almost too normal. Odd.

[–] [email protected] 9 points 4 days ago (1 children)

On the other hand it looks almost too normal. Odd.

It could indicate bias on the part of the researchers. I haven't read their methodology, but in my amateur study of languages, some languages have some interesting tricks for communication that don't translate to English well or efficiently. If English was used as the baseline, then the study may not incorporate some of the neat things other languages can do as points to measure.

Mandarin has a word particle to communicate "completed action". This is used instead of conjugating verbs for tenses. Example: in English you might say:


"I went to the shop" 5 syllables


In Mandarin the literal translation back to English would be:

"I go to the shop [completed action]" 5 syllables

For the two measures listed, essentially Information Density and Speech Velocity, this benefit wouldn't show up, but if you measure something like Encoding and Decoding Burden (I'm making up these terms), then Mandarin could rank higher.

[–] [email protected] 4 points 4 days ago

Looking up the article, the baseline is French and English, I'd say. So it might be biased, but I didn't read the article, and even if I did, I'm a chemical engineer, so what do I know of this field.

[–] [email protected] 5 points 4 days ago

Could be bias. But I wonder if it could be because English has borrowed so much from other languages.

It’s also interesting that English and French look so similar in the graphs. Both have been the de facto international language for a long time.

[–] [email protected] 5 points 4 days ago (2 children)
[–] ironhydroxide 9 points 4 days ago (4 children)

Opposite. Look at the notes at the top of the graphs

[–] [email protected] 6 points 4 days ago

Not as efficient as others in bits per second, but interestingly the syllable-to-bits ratio is tightly coupled.

[–] [email protected] 5 points 4 days ago* (last edited 4 days ago) (1 children)

Hard to tell. Need something like "bits of information per syllable" to get at efficiency. Just eyeballing it, Vietnamese, English, and Cantonese seem likely to be the most efficient.
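
A rough sketch of that ratio, assuming you can read an information rate (bits per second) and a syllable rate (syllables per second) for each language off the charts. The function name and the two placeholder readings are made up for illustration and are not values from the graph:

```python
def bits_per_syllable(info_rate_bps: float, syllable_rate_sps: float) -> float:
    """Information density: bits per second divided by syllables per second."""
    if syllable_rate_sps <= 0:
        raise ValueError("syllable rate must be positive")
    return info_rate_bps / syllable_rate_sps


# Hypothetical (info rate, syllable rate) pairs, NOT read from the chart.
readings = {
    "language_a": (40.0, 8.0),
    "language_b": (40.0, 5.0),
}

# Density per language, then names ordered from densest to least dense.
density = {
    name: bits_per_syllable(rate, syllables)
    for name, (rate, syllables) in readings.items()
}
ranked = sorted(density, key=density.get, reverse=True)

for name in ranked:
    print(f"{name}: {density[name]:.1f} bits/syllable")
```

With equal information rates, the language that gets there with fewer syllables per second comes out ahead on this measure.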

[–] [email protected] 3 points 4 days ago

Cantonese and Vietnamese make sense, as they are both tonal languages (along with Mandarin, Thai, Punjabi, and Cherokee apparently). English wastes tones on communicating stress or question vs statement.

[–] captain_aggravated 3 points 4 days ago

That was the issue I had with my elementary school Spanish teacher. He spoke so fast that you just couldn't latch onto anything. It just sounded like DDDDDDDDDDDDDDDDS aqui. DDDDDDDDDDDDDDDDRS agostos.
