Zipf’s Law

Published on 21/08/2008



In our recent Plus article Tasty maths, we introduced Zipf’s law. Zipf’s law arose out of an analysis of language by linguist George Kingsley Zipf, who theorised that given a large body of language (that is, a long book, or every word uttered by Plus employees during the day), the frequency of each word is close to inversely proportional to its rank in the frequency table. That is:

$ P_n \propto 1/n^a $

where $ P_n $ is the frequency of the word of rank $ n $ and $ a $ is close to 1. This is known as a “power law” and suggests that the most frequent word will occur approximately twice as often as the second most frequent word, which in turn occurs twice as often as the fourth most frequent word, and so on. A famous study of the Brown Corpus found that its words followed Zipf’s law quite well, with “the” being the most frequently occurring word (accounting for nearly 7% of all word occurrences: 69,971 out of slightly over 1 million), and “of” the second most frequent (3.5% of all words).
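The ratios quoted above follow directly from the formula when $ a = 1 $. As a minimal sketch (the function name and the choice of 1,000 ranks are illustrative assumptions, not part of the original analysis), the following computes normalised Zipf frequencies and checks the "twice as often" pattern:

```python
def zipf_frequencies(num_ranks, a=1.0):
    """Return frequencies P_n proportional to 1/n**a for ranks 1..num_ranks,
    normalised so that they sum to 1."""
    weights = [1.0 / n**a for n in range(1, num_ranks + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical vocabulary of 1,000 ranked words with exponent a = 1.
freqs = zipf_frequencies(1000, a=1.0)

# Rank 1 against rank 2, and rank 2 against rank 4 (list indices start at 0).
ratio_1_to_2 = freqs[0] / freqs[1]
ratio_2_to_4 = freqs[1] / freqs[3]
print(ratio_1_to_2, ratio_2_to_4)
```

With an exponent slightly different from 1 the ratios become $ 2^a $ rather than exactly 2, which is why real corpora only match the pattern approximately.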

Figure: The frequency of words on Plus fits the Zipf distribution very well.

Never one to turn down a challenge, Plus set about checking whether the frequency of words on all Plus pages matches the Zipf distribution, and as you can see in the chart, it fits remarkably well! The most popular word on Plus is “the”, which is mentioned 114,001 times, or 6.86% of all words. Second in line is “of”, occurring 62,964 times, and third is “to”, occurring 45,045 times. Unsurprisingly, the word “maths” features more highly than in normal usage, coming in at 40th place having been mentioned 4,829 times. “Mathematics” is at 51st and “mathematical” at 54th. “Plus” comes in at 76th, having been mentioned 2,454 times. A similar test has been done on word usage in Wikipedia, where it was found that Zipf’s law holds true for the top 10,000 words.
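A check of this kind can be sketched in a few lines of Python. This is not the script used for the Plus analysis; the tokenising rule, the function names and the tiny synthetic text are all assumptions made for illustration. The idea is to rank words by count and then estimate the exponent $ a $ as the negative slope of a least-squares line through the points (log rank, log count):

```python
import math
import re
from collections import Counter


def rank_frequency(text):
    """Return (word, count) pairs sorted from most to least frequent."""
    words = re.findall(r"[a-z']+", text.lower())  # crude tokeniser (assumption)
    return Counter(words).most_common()


def fit_zipf_exponent(counts):
    """Estimate a in count ~ rank**(-a) by least squares on the log-log plot.

    `counts` must be sorted in descending order and contain at least two values.
    """
    xs = [math.log(rank) for rank in range(1, len(counts) + 1)]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope


# Synthetic text whose counts are exactly 60/rank, so the fit should give a = 1.
sample = "the " * 60 + "of " * 30 + "to " * 20 + "and " * 15 + "a " * 12
table = rank_frequency(sample)
exponent = fit_zipf_exponent([count for _, count in table])
print(table, exponent)
```

On real text the points scatter around the line, and the fit is usually restricted to the higher-ranked words, since rare words that occur only once or twice distort the tail.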

But what lies behind Zipf’s law? There has never been a real explanation of why it should occur for languages, and there is controversy surrounding whether it gives any meaningful insight into human language. Power laws relating rank to frequency have been demonstrated to occur naturally in many places: the size of cities, the number of hits on websites, the magnitude of earthquakes and the diameters of moon craters have all been shown to follow power laws.

Wentian Li demonstrated in his paper Random Texts Exhibit Zipf’s-Law-Like Word Frequency Distribution, published in IEEE Transactions on Information Theory, that words generated by randomly combining letters fit the Zipf distribution. In his randomly generated text, the frequency distribution of word length was exponential; that is, words of length 1 occurred more than words of length 2 and so forth, with frequency declining exponentially with word length. Li showed mathematically that the power law distribution of frequency against rank is a natural consequence of the word length distribution. His underlying theory is that the rank distribution arises naturally out of the fact that word length plays a part: long words tend not to be very common, whilst shorter words are. It is easy to see how this has occurred in the evolution of language. Li argues that as Zipf distributions arise in randomly generated texts with no linguistic structure, the law may be a statistical artifact rather than a meaningful linguistic property.
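The random-text idea is easy to try out. The sketch below is not Li’s code or his exact setup: the two-letter alphabet, the equal probability given to each letter and to the space, the text length and the seed are all assumptions chosen to keep the example small. It generates random characters, splits them into "words" at the spaces, and tallies both word lengths and individual words:

```python
import random
from collections import Counter


def random_text_words(num_chars, letters="ab", seed=0):
    """Draw characters uniformly from `letters` plus a space and split into words."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    alphabet = letters + " "
    chars = rng.choices(alphabet, k=num_chars)
    return "".join(chars).split()  # runs of spaces produce no empty words


words = random_text_words(200_000)
word_counts = Counter(words)                     # how often each distinct word occurs
length_counts = Counter(len(w) for w in words)   # how many words have each length

# Word lengths: the totals fall off geometrically as length grows.
for length in range(1, 6):
    print(length, length_counts[length])

# Rank table: every one-letter word outranks every two-letter word, and so on.
for rank, (word, count) in enumerate(word_counts.most_common(8), start=1):
    print(rank, word, count)
```

Under these assumptions a word stops at each step with probability 1/3, so roughly a third of all words have length 1, two ninths have length 2, and so on. Plotting log count against log rank for `word_counts.most_common()` gives the stepped, roughly straight line that Li described.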

In any case, word length in English does not follow an exponential distribution like a randomly generated text. Looking at Plus words, you can see that the most common word length is 3:

Figure: The length of words on Plus fits a gamma distribution very well.

The distribution nicely fits the curve:

\[ f = a L^b c^L \]

where $ L $ is word length, $ a = 0.16 $, $ b = 2.33 $ and $ c = 0.49 $. This is a form of the gamma distribution, and the fit is similar to the findings of Sigurd, Eeg-Olofsson and van Weijer. Li’s method of relating the exponential distribution of word length in randomly generated text to rank (in which each word of length L is known to occur more frequently than each word of length L+1, and so has a higher rank) does not work here, as the peak is at words of length 3, not 1.
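The location of the peak can be checked from the fitted constants alone. Since $ c^L = e^{L \ln c} $, the curve has the shape of a gamma density with shape parameter $ b + 1 $ and scale $ -1/\ln c $, and setting the derivative of $ b \ln L + L \ln c $ to zero puts its maximum at $ L = -b/\ln c $. A minimal sketch, using only the three values quoted above (the function name and the range of lengths are illustrative assumptions):

```python
import math

# Fitted constants quoted in the article.
A, B, C = 0.16, 2.33, 0.49


def fitted_frequency(length, a=A, b=B, c=C):
    """Evaluate the fitted curve f = a * L**b * c**L at a given word length."""
    return a * length**b * c**length


# Evaluate the curve at whole-number word lengths 1..20.
curve = {length: fitted_frequency(length) for length in range(1, 21)}
peak_length = max(curve, key=curve.get)

# Maximum of the continuous curve, from b/L + ln(c) = 0.
continuous_peak = -B / math.log(C)

print(peak_length, round(continuous_peak, 2))
```

The whole-number peak lands on length 3, in line with the chart, whereas an exponential length distribution would always peak at length 1.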

The jury remains out as to whether there is any significance in Zipf’s law — does it cast light on the way we structure language and how language evolved? Or is it simply a statistical artifact? What do you think?

Further reading

MEJ Newman has put together a very nice article Power laws, Pareto distributions and Zipf’s law in Contemporary Physics.


Nothing announces official launch date for new Ear (stick) AirPods alternatives

Nothing Ear (stick) held by a model on white background



(Image credit: Nothing )

True to form, Nothing has just announced the full reveal date for its upcoming audio product, Ear (stick). 

So, an announcement about an announcement. You’ve got to hand it to Carl Pei’s marketing department, they never miss a trick.

What we’re saying is that although we still have ‘nothing’ conclusive about the features, pricing or release date for the Ear (stick) except an image of another model holding them (and we’ve seen plenty of those traipsing down the catwalk recently), we do have a date – the day when we’ll be granted official access to this information. 

That day is October 26. Nothing assures us that on this day we’ll be able to find out everything, including pricing and product specifications, during the online Ear (stick) Reveal, at 3PM BST (which is 10AM ET, or 1AM on Wednesday if you’re in Sydney, Australia) on nothing.tech.

Any further information? A little. Nothing calls the Ear (stick), which is now the product’s official name, “the next generation of Nothing sound technology”, and its “most advanced audio product yet”. 

But that’s not all! Apparently, Ear (stick) are “half in-ear true wireless earbuds that balance supreme comfort with exceptional sound, made not to be felt when in use. They’re feather-light with an ergonomic design that’s moulded to your ears. Delivered in a unique charging case, inspired by classic cosmetic silhouettes, and compactly formed to simply glide into pockets.” 

Opinion: I need more than a lipstick-style case

Nothing Ear (stick) – official leaked renders pic.twitter.com/FrhKmRttmi (October 1, 2022)

It’s no secret that I want Nothing’s earbuds to succeed in a world dominated by AirPods; who doesn’t love a plucky, eccentric underdog?

But in order to become some of the best true wireless earbuds on the market, there is room for improvement over the Nothing Ear 1, the company’s inaugural earbuds. 

Aside from this official ‘news’ from Nothing, leaked images and videos of the Ear (stick) have been springing up all over the internet (thank you, developer Kuba Wojciechowski) and they depict earbuds that look largely unchanged, which is a shame. 

For me, the focus needs to shift from gimmicks such as a cylindrical case with a red section at the end which twists up like a lipstick. Don’t get me wrong, I love a bit of theater, but only if the sound coming from the earbuds themselves is top dog. 

As the natural companions for the Nothing Phone 1, it makes sense for the Ear (stick) to take a place similar to that of Apple’s AirPods 3, while the Ear (1) sit alongside the AirPods Pro 2 as the flagship offering.

See, that lipstick case shape likely will not support wireless charging. That and the rumored lack of ANC mean the Ear (stick) is probably arriving as the more affordable option in Nothing’s oeuvre.

For now, we sit tight until October 26. 

Becky is a senior staff writer at TechRadar (which she has been assured refers to expertise rather than age) focusing on all things audio. Before joining the team, she spent three years at What Hi-Fi? testing and reviewing everything from wallet-friendly wireless earbuds to huge high-end sound systems. Prior to gaining her MA in Journalism in 2018, Becky freelanced as an arts critic alongside a 22-year career as a professional dancer and aerialist – any love of dance starts with a love of music. Becky has previously contributed to Stuff, FourFourTwo and The Stage. When not writing, she can still be found throwing shapes in a dance studio, these days with varying degrees of success.  


YouTube could make 4K videos exclusive to Premium subscribers

Woman watching YouTube on mobile phone screen



(Image credit: Shutterstock / Kicking Studio)

You might soon have to buy YouTube Premium to watch 4K YouTube videos, a new user test suggests.

According to a Reddit thread highlighted on Twitter by leaker Alvin, several non-Premium YouTube users have reported seeing 4K resolution (and higher) video options limited to YouTube Premium subscribers on their iOS devices. For these individuals, videos are currently only available to stream in up to 1440p (QHD) resolution.

The apparent experiment only seems to be affecting a handful of YouTube users for now, but it suggests owner Google is toying with the idea of implementing a site-wide paywall for access to high-quality video in the future.

So, after testing up to 12 ads on YouTube for non-Premium users, now some users reported that they also have to get a Premium account just to watch videos in 4K. pic.twitter.com/jJodoAxeDp (October 1, 2022)

It’s no secret that Google has been searching for new ways to monetize its YouTube platform in recent months. In September, the company introduced five unskippable ads for some YouTube users as part of a separate test – an unexpected development that, naturally, didn’t go down well with much of the YouTube community. 

A resolution paywall seems a more palatable approach from Google. While annoying, the change isn’t likely to provoke the same level of ire from non-paying YouTube users as excessive ads, given that many smartphones still max out at QHD resolution anyway. 

Of course, if it encourages those who do care about high-resolution viewing to invest in the platform’s Premium subscription package, it may also be more lucrative for Google. After all, YouTube Premium, which offers ad-free viewing, background playback and the ability to download videos for offline use, currently costs $11.99 / £11.99 / AU$14.99 per month.

Suffice to say, the subscription service hasn’t taken off in quite the way Google would’ve hoped since its launch in 2014. Only around 50 million users are currently signed up to YouTube Premium, while something close to 2 billion people actively use YouTube on a monthly basis. 

Might the addition of 4K video into Premium’s perk package bump up that number? Only time will tell. We’ll be keeping an eye on our own YouTube account to see whether this resolution paywall becomes permanent in the coming months.

Axel is a London-based staff writer at TechRadar, reporting on everything from the newest movies to latest Apple developments as part of the site’s daily news output. Having previously written for publications including Esquire and FourFourTwo, Axel is well-versed in the applications of technology beyond the desktop, and his coverage extends from general reporting and analysis to in-depth interviews and opinion. 

Axel studied for a degree in English Literature at the University of Warwick before joining TechRadar in 2020, where he then earned a gold standard NCTJ qualification as part of the company’s inaugural digital training scheme. 


Europe sets deadline for USB-C charging for (almost) all laptops


USB-C as the charging standard in the EU

Mundissima / Shutterstock


Author: Michael Crider, Staff Writer

Michael is a former graphic designer who’s been building and tweaking desktop computers for longer than he cares to admit. His interests include folk music, football, science fiction, and salsa verde, in no particular order.


Copyright © 2022 Xanatan