Core algorithms deployed

Question

To demonstrate the importance of algorithms (e.g. to students and professors who don't do theory or are even from entirely different fields) it is sometimes useful to have ready at hand a list of examples where core algorithms have been deployed in commercial, governmental, or widely-used software/hardware.

I am looking for such examples that satisfy the following criteria:

The software/hardware using the algorithm should be in wide use right now.
The example should be specific. Please give a reference to a specific system and a specific algorithm.
E.g., in "algorithm X is useful for image processing" the term "image processing" is not specific enough; in "Google search uses graph algorithms" the term "graph algorithms" is not specific enough.
The algorithm should be taught in typical undergraduate or Ph.D. classes in algorithms or data structures. Ideally, the algorithm is covered in typical algorithms textbooks. E.g., "well-known system X uses little-known algorithm Y" is not good.

Update:

Thanks again for the great answers and links! Some people comment that it is hard to satisfy the criteria because core algorithms are so pervasive that it's hard to point to a specific use. I see the difficulty. But I think it is worthwhile to come up with specific examples because in my experience telling people: "Look, algorithms are important because they are just about everywhere!" does not work.

Comments are not for extended discussion; this conversation has been moved to chat. — Bjørn Kjos-Hanssen, Sep 3, 2018 at 5:56

Community · Accepted Answer · 2013-11-24 11:48:02Z

Algorithms that are the main driver behind a system are, in my opinion, easier to find in non-algorithms courses for the same reason theorems with immediate applications are easier to find in applied mathematics rather than pure mathematics courses. It is rare for a practical problem to have the exact structure of the abstract problem in a lecture. To be argumentative, I see no reason why fashionable algorithms course material such as Strassen's multiplication, the AKS primality test, or the Moser-Tardos algorithm is relevant for low-level practical problems of implementing a video database, an optimizing compiler, an operating system, a network congestion control system or any other system. The value of these courses is learning that there are intricate ways to exploit the structure of a problem to find efficient solutions. Advanced algorithms is also where one meets simple algorithms whose analysis is non-trivial. For this reason, I would not dismiss simple randomized algorithms or PageRank.

I think you can choose any large piece of software and find basic and advanced algorithms implemented in it. As a case study, I've done this for the Linux kernel, and shown a few examples from Chromium.

Basic Data Structures and Algorithms in the Linux kernel

Links are to the source code on github.

Linked list, doubly linked list, lock-free linked list.
B+ Trees with comments telling you what you can't find in the textbooks.

A relatively simple B+Tree implementation. I have written it as a learning exercise to understand how B+Trees work. Turned out to be useful as well.

...

A tricks was used that is not commonly found in textbooks. The lowest values are to the right, not to the left. All used slots within a node are on the left, all unused slots contain NUL values. Most operations simply loop once over all slots and terminate on the first NUL.
Priority sorted lists used for mutexes, drivers, etc.
Red-Black trees are used for scheduling, virtual memory management, to track file descriptors and directory entries,etc.
Interval trees
Radix trees, are used for memory management, NFS related lookups and networking related functionality.

A common use of the radix tree is to store pointers to struct pages;
Priority heap, which is literally, a textbook implementation, used in the control group system.

Simple insertion-only static-sized priority heap containing pointers, based on CLR, chapter 7
Hash functions, with a reference to Knuth and to a paper.

Knuth recommends primes in approximately golden ratio to the maximum integer representable by a machine word for multiplicative hashing. Chuck Lever verified the effectiveness of this technique:

http://www.citi.umich.edu/techreports/reports/citi-tr-00-1.pdf

These primes are chosen to be bit-sparse, that is operations on them can use shifts and additions instead of multiplications for machines where multiplications are slow.
Some parts of the code, such as this driver, implement their own hash function.

hash function using a Rotating Hash algorithm

Knuth, D. The Art of Computer Programming, Volume 3: Sorting and Searching, Chapter 6.4. Addison Wesley, 1973
Hash tables used to implement inodes, file system integrity checks etc.
Bit arrays, which are used for dealing with flags, interrupts, etc. and are featured in Knuth Vol. 4.
Semaphores and spin locks
Binary search is used for interrupt handling, register cache lookup, etc.
Binary search with B-trees
Depth first search and variant used in directory configuration.

Performs a modified depth-first walk of the namespace tree, starting (and ending) at the node specified by start_handle. The callback function is called whenever a node that matches the type parameter is found. If the callback function returns a non-zero value, the search is terminated immediately and this value is returned to the caller.
Breadth first search is used to check correctness of locking at runtime.
Merge sort on linked lists is used for garbage collection, file system management, etc.
Bubble sort is amazingly implemented too, in a driver library.
Knuth-Morris-Pratt string matching,

Implements a linear-time string-matching algorithm due to Knuth, Morris, and Pratt [1]. Their algorithm avoids the explicit computation of the transition function DELTA altogether. Its matching time is O(n), for n being length(text), using just an auxiliary function PI[1..m], for m being length(pattern), precomputed from the pattern in time O(m). The array PI allows the transition function DELTA to be computed efficiently "on the fly" as needed. Roughly speaking, for any state "q" = 0,1,...,m and any character "a" in SIGMA, the value PI["q"] contains the information that is independent of "a" and is needed to compute DELTA("q", "a") 2. Since the array PI has only m entries, whereas DELTA has O(m|SIGMA|) entries, we save a factor of |SIGMA| in the preprocessing time by computing PI rather than DELTA.

[1] Cormen, Leiserson, Rivest, Stein Introdcution to Algorithms, 2nd Edition, MIT Press

[2] See finite automation theory
Boyer-Moore pattern matching with references and recommendations for when to prefer the alternative.

Implements Boyer-Moore string matching algorithm:

[1] A Fast String Searching Algorithm, R.S. Boyer and Moore. Communications of the Association for Computing Machinery, 20(10), 1977, pp. 762-772. http://www.cs.utexas.edu/users/moore/publications/fstrpos.pdf

[2] Handbook of Exact String Matching Algorithms, Thierry Lecroq, 2004 http://www-igm.univ-mlv.fr/~lecroq/string/string.pdf

Note: Since Boyer-Moore (BM) performs searches for matchings from right to left, it's still possible that a matching could be spread over multiple blocks, in that case this algorithm won't find any coincidence.

If you're willing to ensure that such thing won't ever happen, use the Knuth-Pratt-Morris (KMP) implementation instead. In conclusion, choose the proper string search algorithm depending on your setting.

Say you're using the textsearch infrastructure for filtering, NIDS or
any similar security focused purpose, then go KMP. Otherwise, if you really care about performance, say you're classifying packets to apply Quality of Service (QoS) policies, and you don't mind about possible matchings spread over multiple fragments, then go BM.

Data Structures and Algorithms in the Chromium Web Browser

Links are to the source code on Google code. I'm only going to list a few. I would suggest using the search feature to look up your favourite algorithm or data structure.

Splay trees.

The tree is also parameterized by an allocation policy (Allocator). The policy is used for allocating lists in the C free store or the zone; see zone.h.
Voronoi diagrams are used in a demo.
Tabbing based on Bresenham's algorithm.

There are also such data structures and algorithms in the third-party code included in the Chromium code.

Binary trees
Red-Black trees

Conclusion of Julian Walker

Red black trees are interesting beasts. They're believed to be simpler than AVL trees (their direct competitor), and at first glance this seems to be the case because insertion is a breeze. However, when one begins to play with the deletion algorithm, red black trees become very tricky. However, the counterweight to this added complexity is that both insertion and deletion can be implemented using a single pass, top-down algorithm. Such is not the case with AVL trees, where only the insertion algorithm can be written top-down. Deletion from an AVL tree requires a bottom-up algorithm.

...

Red black trees are popular, as most data structures with a whimsical name. For example, in Java and C++, the library map structures are typically implemented with a red black tree. Red black trees are also comparable in speed to AVL trees. While the balance is not quite as good, the work it takes to maintain balance is usually better in a red black tree. There are a few misconceptions floating around, but for the most part the hype about red black trees is accurate.
AVL trees
Rabin-Karp string matching is used for compression.
Compute the suffixes of an automaton.
Bloom filter implemented by Apple Inc.
Bresenham's algorithm.

Programming Language Libraries

I think they are worth considering. The programming languages designers thought it was worth the time and effort of some engineers to implement these data structures and algorithms so others would not have to. The existence of libraries is part of the reason we can find basic data structures reimplemented in software that is written in C but less so for Java applications.

The C++ STL includes lists, stacks, queues, maps, vectors, and algorithms for sorting, searching and heap manipulation.
The Java API is very extensive and covers much more.
The Boost C++ library includes algorithms like Boyer-Moore and Knuth-Morris-Pratt string matching algorithms.

Allocation and Scheduling Algorithms

I find these interesting because even though they are called heuristics, the policy you use dictates the type of algorithm and data structure that are required, so one need to know about stacks and queues.

Least Recently Used can be implemented in multiple ways. A list-based implementation in the Linux kernel.
Other possibilities are First In First Out, Least Frequently Used, and Round Robin.
A variant of FIFO was used by the VAX/VMS system.
The Clock algorithm by Richard Carr is used for page frame replacement in Linux.
The Intel i860 processor used a random replacement policy.
Adaptive Replacement Cache is used in some IBM storage controllers, and was used in PostgreSQL though only briefly due to patent concerns.
The Buddy memory allocation algorithm, which is discussed by Knuth in TAOCP Vol. 1 is used in the Linux kernel, and the jemalloc concurrent allocator used by FreeBSD and in facebook.

Core utils in *nix systems

grep and awk both implement the Thompson-McNaughton-Yamada construction of NFAs from regular expressions, which apparently even beats the Perl implementation.
tsort implements topological sort.
fgrep implements the Aho-Corasick string matching algorithm.
GNU grep, implements the Boyer-Moore algorithm according to the author Mike Haertel.
crypt(1) on Unix implemented a variant of the encryption algorithm in the Enigma machine.
Unix diff implemented by Doug McIllroy, based on a prototype co-written with James Hunt, performs better than the standard dynamic programming algorithm used to compute Levenshtein distances. The Linux version computes the shortest edit distance.

Cryptographic Algorithms

This could be a very long list. Cryptographic algorithms are implemented in all software that can perform secure communications or transactions.

Merkle trees, specifically the Tiger Tree Hash variant, were used in peer-to-peer applications such as GTK Gnutella and LimeWire.
MD5 is used to provide a checksum for software packages and is used for integrity checks on *nix systems (Linux implementation) and is also supported on Windows and OS X.
OpenSSL implements many cryptographic algorithms including AES, Blowfish, DES, SHA-1, SHA-2, RSA, DES, etc.

Compilers

LALR parsing is implemented by yacc and bison.
Dominator algorithms are used in most optimizing compilers based on SSA form.
lex and flex compile regular expressions into NFAs.

Compression and Image Processing

The Lempel-Ziv algorithms for the GIF image format are implemented in image manipulation programs, starting from the *nix utility convert to complex programs.
Run length encoding is used to generate PCX files (used by the original Paintbrush program), compressed BMP files and TIFF files.
Wavelet compression is the basis for JPEG 2000 so all digital cameras that produce JPEG 2000 files will be implementing this algorithm.
Reed-Solomon error correction is implemented in the Linux kernel, CD drives, barcode readers and was combined with convolution for image transmission from Voyager.

Conflict Driven Clause Learning

Since the year 2000, the running time of SAT solvers on industrial benchmarks (usually from the hardware industry, though though other sources are used too) has decreased nearly exponentially every year. A very important part of this development is the Conflict Driven Clause Learning algorithm that combines the Boolean Constraint Propagation algorithm in the original paper of Davis Logemann and Loveland with the technique of clause learning that originated in constraint programming and artificial intelligence research. For specific, industrial modelling, SAT is considered an easy problem (see this discussion). To me, this is one of the greatest success stories in recent times because it combines algorithmic advances spread over several years, clever engineering ideas, experimental evaluation, and a concerted communal effort to solve the problem. The CACM article by Malik and Zhang is a good read. This algorithm is taught in many universities (I have attended four where it was the case) but typically in a logic or formal methods class.

Applications of SAT solvers are numerous. IBM, Intel and many other companies have their own SAT solver implementations. The package manager in OpenSUSE also uses a SAT solver.

Nice list, but I'm not sure about including the OS scheduling algorithms and CDCL since they are heuristic and in the worst-case can be quite bad. — Huck Bennett, Nov 13, 2013 at 1:48
@HuckBennett, CDCL is an algorithm parameterized by heuristics but is not itself a heuristic. It has worst case exponential behaviour but it is non-trivial to show that. Moreover, we cannot do provably better and it is the best we can do in practice, so I feel all computer scientists should know about it! As for LRU,FIFO, etc. they are heuristics, but, as with ARC, may require clever algorithms or data structures to implement. — Vijay D, Nov 13, 2013 at 2:01
I'm not saying that CDCL isn't cool or important (it is) but I don't think it's really an algorithm as is usually meant in theory since we can't really prove anything about it. "Here's an algorithm that seems to work well in practice" is more the domain of AI and formal methods. AFAIK it has $O (2^{n})$ worst-case running time and no other guarantees using other formal analyses. We actually can do better for various NP-complete variants of SAT such as Schoning's $O^{*} (1.334)$ -time algorithm for 3-SAT. — Huck Bennett, Nov 13, 2013 at 3:51
Wouldn't such a comment have applied to Simplex: initially not well understood and later shown to be exponential but works in practice and much later shown to have polynomial smoothed complexity? CDCL is interesting for algorithm analysis because you need to go via proof complexity to derive families of formulae exhibiting worst case behaviour, and also to show it can be exponentially more succinct than some variants of resolution. There are various extensions, such as symmetry breaking and autarky techniques for which such an analysis is still open. — Vijay D, Nov 13, 2013 at 8:43
@VijayD Thanks for the interesting list. I find the use of Red-Black trees in the Linux kernel compelling. — Manu, Nov 13, 2013 at 11:27
Nice update! I like the list of basic data structures that show up in the Linux kernel. Ideally, it would be great if one could find more examples like that but that are not strictly confined to data structures. — Manu, Nov 14, 2013 at 16:13
@EmanueleViola, please elaborate. By more examples, do you mean more systems, or do you mean, more algorithms in the same system? What would you like to see beyond data structures? (maybe add it to the original question) — Vijay D, Nov 14, 2013 at 17:30
@VijayD: I mean more algorithms in some system. E.g. notable members of sorting, dynamic programming, greedy, flow, LP, graph algorithms, approx algorithms, string matching, number-theoretic, and whatever else you think is core material. — Manu, Nov 15, 2013 at 12:35
@EmanueleViola, I've added a few more examples. The post is long now, so I don't want to extend it. Maybe you should ask a new question specifically about implementations of Dijkstra, Simplex, Bloom filters as part of a real system like Linux, Chrome, a web server etc. I think you are more likely to get good answers if you are specific. — Vijay D, Nov 17, 2013 at 4:32
wow, did you managed to write down the list off the top of your head? :) — DRC, Nov 24, 2013 at 10:59
Wow, is this the most upvoted answer on cstheory now :)? Did someone post this question on reddit? — Sasho Nikolov, Nov 24, 2013 at 18:25
One of these days, I am going to add Wikipedia links along with the source code links. (unless someone else does it before.) — lprsd, Nov 28, 2013 at 6:18

Wes Turner · Accepted Answer · 2013-11-24 07:42:00Z

-3

Rosetta Code lists applied algorithms by Programming Task (692) and by Programming Language (518) with Semantic MediaWiki.

answered Nov 24, 2013 at 7:42

community wiki

Wes Turner

How is this an example of "core algorithms... deployed in commercial, governmental, or widely-used software/hardware"?
– David Richerby
Nov 25, 2013 at 16:34
It would be useful to cross-reference the implementations of each of the excellent algorithms listed in other answers here to Wikipedia/DBpedia URIs. There aren't Wikipedia/DBpedia URIs for all of these algorithms; but there are Rosetta Code pages.
– Wes Turner
Nov 26, 2013 at 10:46
bigocheatsheet.com also lists Big-O complexity and links to Wikipedia articles for quite a few algorithms.
– Wes Turner
Nov 26, 2013 at 10:48
The question asks for examples of core algorithms used in significant pieces of software. "Here's a website with algorithms implemented in a million languages" doesn't answer that question at all. In fact, it's the exact opposite of what the question is looking for.
– David Richerby
Nov 26, 2013 at 13:32
A useful, contextually relevant reference nonetheless.
– Wes Turner
Nov 27, 2013 at 0:36

Add a comment |

Christian Sommer · Accepted Answer · 2013-11-24 04:15:23Z

Customizable Route Planning (Daniel Delling, Andrew V. Goldberg, Thomas Pajor, and Renato F. Werneck) http://research.microsoft.com/apps/pubs/default.aspx?id=145688

powers Bing Maps: http://www.bing.com/blogs/site_blogs/b/maps/archive/2012/01/05/bing-maps-new-routing-engine.aspx

vzn · Accepted Answer · 2013-11-16 16:46:01Z

maybe all the major/preferred algorithms of interest to this audience have been mentioned at this point. however, a few more deserve mention for completeness. & some analysis of what is considered a significant algorithm is relevant here.

in CS & IT fields there seems to be a phenomenon noticed long ago in AI called "moving the goalposts". this is a psychological phenomenon where the field advances relatively quickly but people quickly mentally adjust to "the new normal" and take real or even breakthrough advances as mundane or unremarkable in retrospect, after accomplished, ie downplayed or minimized. this is highly captured in this question in the way that algorithms move from R&D into "deployment". quoting the author of the question in later comments:

In fact, a negligible fraction of all the code that gets written is implementing anything that is interesting from an algorithmic point of view.

but this is problematic and basically a TCS-centric redefinition of the word "algorithm". presumably the interesting algorithms are advanced. does that mean that if a problem is reduced to an advanced algorithm, its no longer "interesting"? and "advanced" is clearly a moving target. so there is a way to define "algorithms" narrowly, or broadly. it seems the TCS definition changes on context, but note even in TCS, there is a trend toward the broad definition eg in the so-called "algorithmic lens".

sometimes the most ubiquitous algorithms are also the most overlooked! the internet and WWW is a large environment/near-ecology for algorithms. still relatively young at only about 2 decades old (invented ~1991) it has grown massively and exponentially in a short amount of time. WWW site growth has probably even outpaced the famous exponential Moores law.

the internet/WWW are supported by many sophisticated algorithms. the internet has complex routing algorithms built into routers (again powering multi-billion-dollar corporations such as Cisco). some advanced theory is applicable there eg in routing algorithms. these algorithms were a subject of emerging, advanced/cutting edge research decades ago however a now so finetuned & well understood that its somewhat invisible.

we should not so soon forget that decades ago, leading researchers were not even sure if the internet world work or was possible (seen in early packet switching research, a radical new design pattern at the time departing from the prior circuit-switching), and even a few years ago there were fears that it would fail to scale at some point and begin to fail due to overwhelming spikes in volume.

it also uses sophisticated error detection/correction. the internet probably the largest, most fault-tolerant system ever built by humans, still growing.

next, there is a strong case to make that the algorithms powering the WWW are advanced. HTTP & web servers are highly tuned/optimized and also use advanced security/encryption protocols (HTTPS). the rendering logic of a web page has become extremely advanced in HTML5 & CSS3, along with the Javascript programming language.

the relatively new CSS has various principles similar to OOP programming such as reusability and inheritance. speaking of typesetting, TeX is an important, internally complex scientific typesetting system (not so different than a programming language) invented by Knuth that can now be rendered on web pages (and is used in possibly hundreds of thousands of scientific papers or more).

another relatively new area of algorithms building on the internet, still emerging, those based on collective intelligence. stackexchange software itself is an example of a sophisticated collective intelligence system. social networking also exhibits the key features of collective intelligence and features are continually being added to increase that intelligence (for example facebook "Likes" are only a few years old). the field of rating systems is based on collaborative filtering algorithms and is still evolving based on new research and applications.

so in short, all revolutionary successes transforming daily human experience actually quite far beyond merely "field goals". as the title of the question states, all core algorithms deployed. now so ubiquitous and invisible as to be something like the IT expression, "part of the plumbing".

many citations could be added to this. here is one to start: DARPA and the internet revolution by Waldrop — vzn, Nov 19, 2013 at 18:32
another ref on internet optimization, biography of Danny Lewin, cofounder of akamai, "the genius who transformed the internet" — vzn, Dec 1, 2013 at 3:40

ClericGunem · Accepted Answer · 2013-11-16 05:27:31Z

An example of FFT

I once helped port an FFT algorithm to a different system language.

The algorithm was being used to determine line breaks in coaxial delivery of cable tv/internet/phone. Basically a technician would request a signal to be sent to the customer's box, at the same time they would pull up a real-time display of the statistics for the specific customer, such as QoS, dB, .... The technician could use the data and a graph to determine within a few feet between the house and pole where a partial break existed (or multiple breaks as I was told).

As mentioned above FFT is widely used, but this was one of the more blatant and obvious (in terms of why and how) I have seen in practice.

Sorry I had to keep it a high level.

vzn · Accepted Answer · 2013-11-14 23:43:28Z

two other personal favorite examples firmly rooted in computer science but maybe easily overlooked by abstractionist theoreticians, which have undergone enormous/transformative advances and had major-to-massive practical/applied impact in daily life over the last few decades. already an entire generation has grown up not knowing the world without them. basically the category of Modelling and simulation.

physics simulation algorithms. mainly using Newtons laws but using other laws (such as fluid dynamics). used in a wide variety of applications ranging from engineering applications, video games, and sometimes movies. this is responsible also for significantly improving safety, efficiency, or reliability of eg cars & airplanes via subjecting virtual/test designs to simulated stresses. a major related ongoing research area from bioinformatics with massive implications in biology eg drug design, disease prevention etc: protein folding/structure prediction. also note this years Nobel Prize in Chemistry was awarded for chemistry simulation to Karplus, Levitt, Warshel. physics simulation algorithms are highly involved in nuclear weapons safety/testing eg at Los Alamos laboratories.
raytracing/CGI algorithms. this started out as a research topic merely a few decades ago [a friend got his masters degree in CS writing raytracing algorithms] but became very applied in eg games and the movie making business, reaching extraordinary levels of verisimilitude that is responsible for large amounts of special effects in movies. these industries have literally billions of dollars invested and riding on these algorithms and entire large corporations are based on harnessing them, eg Pixar. mostly initially used in eg scifi movies, the technique is now so widespread it is routinely used even in "typical" movies. for example recently The Great Gatsby relied heavily on CGI effects to draw convincing or stylized environments, retouch the film/characters, etc.

Leeor · Accepted Answer · 2013-11-14 13:51:43Z

A fascinating algorithmic problem arises in medical application of CT scan. In Computed Tomography (CT), the body is exposed to X-rays from different angles. At one end of the scanner are the X-ray transmitters and at the other end the sensors. From such a series of scans, an image is reconstructed for the physician to examine!

The filtered back projection algorithm is the basis for reconstruction of an image from a set of scans. This algorithm is actually a form of an approximation problem in which the "signal" is sampled below Nyquist rate. This algorithm is in use "behind the scenes" at all hospitals and the basic filtered back projection utilizes undergraduate math such as Fourier transforms to achieve the Fourier Slice Theorem.

user19461 · Accepted Answer · 2013-11-14 10:12:19Z

Thinking of very basic algorithms

Random number generators are found everywhere and specifically in all games.
Databases are composed of many algorithms, including B+, Hashes, priority queues, regular expression, criptography, sorting, etc... A friend of mine says SGBDs are at the top of the computing food chain.
Sort is used everywhere, for example in Excel. It is actually used all time in real life, but usually humans use ad-hoc algorithms
Parity bits are used all around
Huffman encoding is in compression and transmission software
Stacks (LIFO) are used everywhere. Inside programming languages, in CPUs, etc...

Nice to show they appearing in real life:

A. Many groups use a kind of covering tree algorithm to communicate, by dividing telephone lists in hierarchical way among people B. Cars in a intersection usually use a round-robin algorithm (in a voluntary way) C. Most places, as banks and hospital, organize their clientes in a FIFO algorithm

Sorting isn't an algorithm. It's a task, i.e., something you wish to perform, for which you must design (or, in practice, choose) an algorithm. — David Richerby, Nov 14, 2013 at 22:22
These don't seem to be specific examples as requested in the question. — Kaveh, Nov 15, 2013 at 16:29

vzn · Accepted Answer · 2013-11-14 02:19:03Z

this relatively new book is worth considering as a complete/detailed answer to the question in convenient, extended/collected form and which could be used as supplemental material for an algorithms class. [some of these have already been mentioned; the strong overlap itself is notable.]

Nine Algorithms That Changed the Future: The Ingenious Ideas That Drive Today's Computers by MacCormick
- search engine indexing
- Pagerank
- public key cryptography
- error correcting codes
- pattern recognition
- data compression
- databases
- digital signatures
- computability theory
another reference somewhat similar but more theoretical, The Best of the 20th Century: Editors Name Top 10 Algorithms Cipra/SIAM.
- Monte Carlo method
- simplex algorithm
- Krylov subspace iteration
- decompositional matrix computation
- Fortran compiler
- QR matrix algorithm
- quicksort
- FFT
- integer relation detection
- fast multipole algorithm

2nd ref is originally from January/February 2000 issue of Computing in Science & Engineering, a joint publication of the American Institute of Physics and the IEEE Computer Society. compiled by guest editors Jack Dongarra of the University of Tennessee and Oak Ridge National Laboratory and Francis Sullivan of the Center for Computing Sciences at the Institute for Defense Analyses — vzn, Nov 14, 2013 at 15:58

Grigory Yaroslavtsev · Accepted Answer · 2013-11-14 00:11:43Z

Viterbi's algorithm, which is still widely used in speech recognition and multiple other applications: http://en.wikipedia.org/wiki/Viterbi_algorithm The algorithm itself is basic dynamic programming.

From Wikipedia: "The Viterbi algorithm was proposed by Andrew Viterbi in 1967 as a decoding algorithm for convolutional codes over noisy digital communication links.[1] The algorithm has found universal application in decoding the convolutional codes used in both CDMA and GSM digital cellular, dial-up modems, satellite, deep-space communications, and 802.11 wireless LANs. It is now also commonly used in speech recognition, speech synthesis, keyword spotting, computational linguistics, and bioinformatics. For example, in speech-to-text (speech recognition), the acoustic signal is treated as the observed sequence of events, and a string of text is considered to be the "hidden cause" of the acoustic signal. The Viterbi algorithm finds the most likely string of text given the acoustic signal."

Joshua Grochow · Accepted Answer · 2013-11-13 20:08:12Z

In the last decade algorithms have been used to increase the number (and quality, I think?) of kidney transplants through various kidney donor matching programs. I've been having trouble finding the latest news on this, but here are at least a few pointers:

As recently as 2007 the Alliance for Paired Donation was using an algorithm of Abraham, Blum, and Sandholm. They may still be using it, but I couldn't find out by searching online. Although this algorithm is almost surely not covered in "standard" courses, it combines several fundamental ideas that surely are taught in such courses to provide a good enough algorithm for a problem that is, in general, NP-complete (a variant of Cycle Cover).
The National Kidney Registry also uses some standard algorithms, including (at one point) CPLEX. This led to an actually performed chain of transplants linking 60 people.

This is one of my favorite examples not just of the success of algorithms, but of the importance of studying algorithms for NP-complete problems. They can literally save lives, and have already done so!

Also, a simpler version of these algorithms is used to trade board games: okasaki.blogspot.co.uk/2008/03/what-heck-is-math-trade.html — Radu GRIGore, Nov 19, 2013 at 21:49

lynxoid · Accepted Answer · 2013-11-13 17:11:45Z

10

Finding a Eulerian path is at the base of the genome assembly -- a task commonly performed when working with full genomes (in bioinformatics, medicine, forensics, ecology).

UPDATE Forgot this obvious one: UPS, FedEx, USPS all have to solve large instances of Traveling Salesman Problem every night. Saves a lot of time and money for them to send the drivers on an optimal route.

UPDATE2 Minimum feedback vertex set problem is used for deadlock resolution in many OSes.

edited Nov 13, 2013 at 17:11

community wiki

lynxoid

Are you sure TSP is the problem that the parcel delivery companies are trying to solve? I thought a bigger practical challenge was knapsack and other kinds of packing problems.
– András Salamon
Nov 13, 2013 at 17:28
The assignments for the drivers change every day (i.e. UPS guy doesn't need to visit the same house every day), so the routes have to be updated on a daily basis. It is not a pure TSP - there are additional constraints such as one-way streets, no U-turns, delivering packages on one side of the street, but not the other.
– lynxoid
Nov 14, 2013 at 21:40
I'm sure packing is also important though.
– lynxoid
Nov 14, 2013 at 21:42

Add a comment |

Anon · Accepted Answer · 2013-11-13 16:40:21Z

An amazingly successful (hardware) algorithm is the power-on reset.

Without a system whereby a computer is in a known state when power is applied, nothing else happens right.

Power-on reset is why anything works at all that has a CPU in it, whether considered embedded or otherwise.

Next time you're at the watering hole for programmers and computer scientists, raise your glass of cherry soda to the power-on reset.

Power-on reset isn't an algorithm. It's a task, i.e., something you wish to perform, for which you must design an algorithm. — David Richerby, Nov 14, 2013 at 22:21

Ravi Kiran · Accepted Answer · 2013-11-13 16:27:19Z

Wikipedia has a decent collection of algorithms/applications classified more or less in a list. Microsoft provides the top cited papers but without any explicit explanation of the area of computer science nor the application. There is also a chronological list from different CS conferences _http://jeffhuang.com/best_paper_awards.html_ compiled by Prof. Huang.

Spectral Clustering is an elegant clustering algorithm, known being Normalized cuts algorithm introduced by Jianbo Shi and Jitendra Malik for image segmentation. It has also well been developed in Data clustering applications, being an good intersection between the two communities.

Alnitak · Accepted Answer · 2013-11-13 10:54:38Z

I like this system for saving the maximum number of lives in the UK with kidney transplants, based on maximum matching algorithms: Paired and Altruistic Kidney Donation. They match up people needing kidneys who have a non-matching friend/relative willing to donate, with other people in the same situation, in a maximal way. Then on donation day, all the donors donate at the same time, followed by a speedy transport of kidneys around the country to the recipients.

TimRing · Accepted Answer · 2013-11-13 09:58:05Z

Bresenham's line algorithm is the single most useful algorithm I've come across. Easy to understand Ive used it for lots of applications from line drawing to a complex spliner for 3d casting engine to a complex polygon renderer, as well as for complex animation and scaling uses.

John Wu · Accepted Answer · 2013-11-13 08:21:11Z

11

Funny that I saw this question today and then coincidentally clicked this link on the many uses of the Fourier Transform.

answered Nov 13, 2013 at 8:21

community wiki

John Wu

Add a comment |

vzn · Accepted Answer · 2013-11-13 01:59:46Z

singular value decomposition (SVD) has a strong connection to statistical factor analysis or principal components analysis and is comprehensible within an undergraduate linear algebra or statistics class, and has many important theoretical properties. it also plays a role in image compression algorithms. it played a key element in the winning entries in the $1M Netflix prize contest (one of the worlds largest datamining competitions in history) and is now implemented on their site to predict user ratings. it is also known to be highly related to Hebbian self-organizing neural networks which originate in biological theory.

there is some connection also to gradient descent which is used widely in machine learning & artificial neural networks and as a very universally applied optimization technique of which case Newton's method is a basic 2d form. there is a gradient-descent algorithm for obtaining the SVD.

oakad · Accepted Answer · 2013-11-13 00:19:02Z

I'm rather surprised that with all the fancy algorithms above, nobody mentioned the venerable Lempel-Ziv family of compression algorithms (invented in 1977/78).

Those are used everywhere - text to image to stream processing. It is quite possible that LZ* is a single most used algorithm family in existence.
Dictionary compression was a considerable breakthrough in compression theory and a stark departure from Shannon-Fano style approach.
The algorithms in the family are rather straightforward and easy to comprehend.

Update

Apparently it was mentioned briefly already.

Dai Le · Accepted Answer · 2013-11-12 23:12:34Z

GNU grep is a command line tool for searching one or more input files for lines containing a match to a specified pattern. It is well-known that grep is very fast! Here's a quote from its author Mike Haertel (taken from here):

GNU grep uses the well-known Boyer-Moore algorithm, which looks first for the
final letter of the target string, and uses a lookup table to tell it how far
ahead it can skip in the input whenever it finds a non-matching character.

mhum · Accepted Answer · 2013-11-12 22:25:44Z

If you're also including PhD-level stuff, many (most?) graduate CS programs include some course in coding theory. If you have a course in coding theory, you will definitely cover the Reed-Solomon code which is integral to how compact discs work and Huffman encoding which is used in JPEG, MP3, and ZIP file formats. Depending on the orientation of the course, you may also cover Lempel-Ziv which is used in the GIF format. Personally, I got Lempel-Ziv in an undergraduate algorithms course, but I think that might be atypical.

And I got a lecture on Huffman encoding as an undergrad, which was required for a project. — Brian S, Nov 12, 2013 at 22:59
Huffman is in one of the first chapters of CLRS, so it should definitely qualify — Sasho Nikolov, Nov 14, 2013 at 14:42

MSalters · Accepted Answer · 2013-11-12 20:33:40Z

13

A* is used in many Personal Navigation Devices (aka GPS units)
A* is very well defined, and has been implemented fairly straightforward.
A* isn't entirely trivial, but it doesn't take a Ph.D. to understand it.

answered Nov 12, 2013 at 20:33

community wiki

MSalters

A* is also frequently taught in game design. I don't think modern 3D games generally use A* for NPC navigation, but 2D/isometric games, as well as older games, do make use of the algorithm.
– Brian S
Nov 12, 2013 at 22:57
@BrianS Do you know examples of pathfinding algorithms used in 3D games, specifically enemies NPCs in games (such as a shooter npc) I remember reading something like... dividing a map in hexagonal sectors and use that as the node, instead of squares, and that allowed a smoother movement.
– Goodwine
Nov 13, 2013 at 17:24
@Goodwine, Sorry, I do not have any real-world examples of pathfinding algorithms in 3D games. My personal experience has been with "cube"-like environments (map made of cubes, which characters stand on -- basically 2D, despite 3D rendering) and dummy AIs used for testing player-characters against.
– Brian S
Nov 13, 2013 at 22:35

Add a comment |

Sasho Nikolov · Accepted Answer · 2013-11-12 19:05:27Z

The CountMin Sketch and Count Sketch, from data streaming algorithms, are used in industrial systems for network traffic analysis and analysis of very large unstructured data. These are data structure that summarize the frequency of a huge number of items in a tiny amount of space. Of course they do that approximately, and the guarantee is that, with high probability, the frequency of any item is approximated to within an an $ε$ -fraction of the total "mass" of all items (the mass is the first or second moment of the frequency vector). This is good enough to find "trending" items, which is what you need in a lot of applications.

Some examples of industrial uses of these data structures are:

Google's Sawzall system for unstructured data analysis uses the Count Sketch to implement a 'most popular items' function
AT&T's Gigascope "stream database" system for network traffic monitoring implements the CountMin sketch.
Sprint's Continous Monitoring (CMON) system implements CountMin.

Here is also a site that collects information about applications of CountMin.

As far as teaching, I know that basic sketching techniques are taught at Princeton in undergraduate discrete math courses. I was taught the CountMin sketch in my first algorithms course. In any case, the analysis of CountMin is simpler than the analysis for almost any other randomized algorithm: it's a straightforward application of pairwise independence and Markov's inequality. If this is not standard material in most algorithms courses, I think it's for historical reasons.

Great examples (though not quite core algo right now).
– Manu
Nov 13, 2013 at 11:31 — Manu, Nov 13, 2013 at 11:31

mhum · Accepted Answer · 2013-11-12 18:25:26Z

30

As I understand it, the National Resident Matching Program was for a long time just a straight application of the Gale-Shapley algorithm for the stable marriage problem. It has since been slightly updated to handle some extra details like spousal assignments (aka the "two-body problem"), etc...

answered Nov 12, 2013 at 18:25

mhum

3,3821 gold badge21 silver badges22 bronze badges

I am not sure stable marriage is typical algorithms material.
– Manu
Nov 12, 2013 at 18:52
17

It's in the Tardos and Kleinberg Algorithms Design book, and also in Motwani's Randomized Algorithms, and both books are widely used. Stable marriage may not be universally taught in algorithms courses, but it certainly is taught in a lot of them.
– Sasho Nikolov
Nov 12, 2013 at 19:11
10

A quick search reveals that it has shown up in Berkeley's CS70, MIT's 6.042, UMD's CMSC451, etc...
– mhum
Nov 12, 2013 at 19:18
1

Interestingly, when you add in spousal assignments, the problem becomes NP-complete: arxiv.org/abs/1308.4534. However, in practice this seems not to cause too much of a problem: en.wikipedia.org/wiki/…
– Joshua Grochow
Nov 12, 2013 at 21:16
2

@EmanueleViola while it might not be covered traditionally, its inclusion in the Kleinberg/Tardos book has made it more popular, (and if not it should be !)
– Suresh Venkat
Nov 12, 2013 at 22:46
I agree that the inclusion in Kleinberg Tardos' book makes this qualify.
– Manu
Nov 13, 2013 at 11:23

Add a comment |

Darth Egregious · Accepted Answer · 2013-11-12 18:02:35Z

7

The Knuth-Morris-Pratt string search is widely used, specific, and taught in undergrad / graduate CS.

answered Nov 12, 2013 at 18:02

community wiki

Darth Egregious

3

Would be good if you could point to a specific use. Something like MS Word uses KMP.
– Manu
Nov 12, 2013 at 18:50

Add a comment |

Suresh Venkat · Accepted Answer · 2013-11-12 17:26:27Z

19

More generally, the Kanellakis prize is awarded by the ACM for precisely such theoretical discoveries that have had a major impact in practice.

the 2012 award is for locality-sensitive hashing, which has become a go-to method for dimensionality reduction in data mining for near neighbor problems (and is relatively easy to teach - at least the algorithm itself)

answered Nov 12, 2013 at 17:26

Suresh Venkat

32k4 gold badges94 silver badges271 bronze badges

I think this is teachable but not widely taught.
– Manu
Nov 12, 2013 at 18:51
3

Unfortunate, but true. However, variants of LSH (like the Count-min sketch and relatives) are beginning to appear in "large-data" or "data mining" courses. I teach bloom filters in my algorithms class, for example.
– Suresh Venkat
Nov 12, 2013 at 22:45
As a personal experience, LSH did not scale for us on an instance of a "big data" (100mln items).
– lynxoid
Nov 14, 2013 at 21:45
1

@lynxoid that's a separate discussion/question :). There are enough examples of where it does work that I think it's relevant to this particular question.
– Suresh Venkat
Nov 14, 2013 at 22:04

Add a comment |

Community · Accepted Answer · 2017-04-13 12:32:30Z

34

I would mention the widely-used software CPLEX (or similar) implementation of the Simplex method/algorithm for solving linear programming problems. It is the (?) most used algorithm in economy and operations research.

"If one would take statistics about which mathematical problem is using up most of the computer time in the world, then (not counting database handling problems like sorting and searching) the answer would probably be linear programming." (L. Lovász, A new linear programming algorithm-better or worse than the simplex method? Math. Intelligencer 2 (3) (1979/80) 141-146.)

The Simplex algorithm has also great influence in theory; see, for instance, the (Polynomial) Hirsch Conjecture.

I guess a typical undergraduate or Ph.D. class in algorithms deals with the Simplex algorithm (including basic algorithms from linear algebra like Gauss Elimination Method).

(Other successful algorithms, including Quicksort for sorting, are listed in Algorithms from the Book.)

edited Apr 13, 2017 at 12:32

CommunityBot

1

answered Nov 12, 2013 at 17:03

vb le

4,8281 gold badge37 silver badges46 bronze badges

"economy and operations research" isn't specific enough. CPLEX isn't the type of example I was looking for either, as it is just an implementation of the algorithm; it would be different if, say, the gcc compiler used the simplex method.
– Manu
Nov 12, 2013 at 18:47
12

I think "linear programming problems" is specific enough when we speak about economy and OR. Also, by CPLEX I meant the algorithm behind the implementation.
– vb le
Nov 12, 2013 at 19:12
16

"Today, most large firms use linear programming to price products and manage supply chains. Transportation firms use it to choose the cheapest way to consolidate, coordinate and route shipments of many products from globally distributed suppliers to distant markets subject to capacity constraints. The petroleum industry uses it for exploration, blending, production scheduling and distribution. The iron and steel industry uses it to evaluate iron ores, explore the addition of coke ovens and select products..." news.stanford.edu/news/2005/may25/dantzigobit-052505.html
– Sasho Nikolov
Nov 12, 2013 at 19:22
Thanks. But I find the quote terribly vague. I think if I say that in front of a class of students, half of it would fall asleep ;-) It would be different if we say something like: UPS uses LP to ship packages as follows... I am not saying such examples are trivial to find, but given that "most large firms use LP" I'd hope we can at least point to one.
– Manu
Nov 13, 2013 at 11:13
10

Off the top of my head, since 2007 LAX (the airport) has used software for solving Stackelberg games to schedule the security personnel. Solving large LPs is part of the whole thing, see e.g. teamcore.usc.edu/ARMOR-LAX. Besides that, I would ask someone from your Operations Research department: they would usually have plenty of war stories about using LP in real life
– Sasho Nikolov
Nov 13, 2013 at 16:39

Add a comment |

Huck Bennett · Accepted Answer · 2013-11-12 16:47:22Z

PageRank is one of the best-known such algorithms. Developed by Google co-founder Larry Page and co-authors, it formed the basis of Google's original search engine and is widely credited with helping them to achieve better search results than their competitors at the time.

We imagine a "random surfer" starting at some webpage, and repeatedly clicking a random link to take him to a new page. The question is, "What fraction of the time will the surfer spend at each page?" The more time the surfer spends at a page, the more important the page is considered.

More formally, we view the internet as a graph where pages are nodes and links are directed edges. We can then model the surfer's action as a random walk on a graph or equivalently as a Markov Chain with transition matrix $M$ . After dealing with some issues to ensure that the Markov Chain is ergodic (where does the surfer go if a page has no outgoing links?), we compute the amount of time the surfer spends at each page as the steady state distribution of the Markov Chain.

The algorithm itself is in some sense trivial - we just compute $M^{k} π_{0}$ for large $k$ and arbitrary initial distribution $π_{0}$ . This just amounts to repeated matrix-matrix or matrix-vector multiplication. The algorithms content is mainly in the set-up (ensuring ergodicity, proving that an ergodic Markov Chain has a unique steady state distribution) and convergence analysis (dependence on the spectral gap of $M$ ).

Incidentally I first learned about PageRank in an algorithms class. In fact, I think the professor chose it because it was a nice example of "algorithms used in practice." If you limit examples to "first half of CLRS" type material, the list of examples will either be too long or too trivial -- quicksort, B-trees, and Dijkstra's algorithm are ubiquitous. — Huck Bennett, Nov 12, 2013 at 19:11
@EmanueleViola I am currently an information science student at a university. I have not taken algorithms but I have had Markov chains and PageRank in other classes — ford prefect, Nov 12, 2013 at 20:09
I teach it to undergraduates also (both in the required algorithms class and in a more specialized graph algorithms elective). — David Eppstein, Nov 13, 2013 at 6:53
Alright, PageRank should qualify even though it's not in standard textbooks (I think). Though the algorithm is a bit atypical (when compared to other core algorithms), because as Huck says it's pretty much all in the analysis. — Manu, Nov 13, 2013 at 11:19
Isn't pagerank supposed to be based on the number of backlinks from other pages? You say it is based on the time that a user spends on a page; which is data a search engine cannot have. — Anirudh Ramanathan, Dec 3, 2013 at 6:17
Here "time" means the fraction of steps in the Markov Chain that the user spends at the page. — Huck Bennett, Dec 3, 2013 at 15:00
You forgot to mention the father of PageRank: the Hub-Authority algorithm. In my opinion, it's a far more commonly studied algorithm, because it's simpler and the subsequent algorithms like PageRank can be easily derived as enhancements. — gaborous, May 11, 2015 at 17:55

Chandra Chekuri · Accepted Answer · 2013-11-12 15:56:23Z

12

Check out Jens Vygen's project BonnTools for Chip Design. http://www.or.uni-bonn.de/~vygen/projects.html I have heard some talks on this and also looked at some of their papers. They use Raghavan-Thompson style randomized rounding as well as multiplicative weight update method for solving large-scale multicommodity flow LPs. However, like any big project, they need to do some engineering as well but the methodology is very much based on well-known algorithms.

answered Nov 12, 2013 at 15:56

Chandra Chekuri

6,97931 silver badges39 bronze badges

I'll have a look but it doesn't sound like typical algorithms material.
– Manu
Nov 12, 2013 at 18:50
8

Hmm, randomized rounding is typically taught in PhD level algorithms courses, no?
– Chandra Chekuri
Nov 12, 2013 at 20:44
2

Why just randomized rounding? Sanjeev Arora, Elad Hazan and Satyen Kale think that even mulplicative weights update method is basic enough to be taught at UG level :) "We feel that our meta algorithm and its analysis are simple and useful enough that they should be viewed as a basic tool taught to all algorithms students together with divide-and-conquer, dynamic programming, random sampling, and the like." (cf. cs.princeton.edu/~arora/pubs/MWsurvey.pdf).
– Jagadish
Nov 14, 2013 at 6:07

Add a comment |

Core algorithms deployed

Update:

29 Answers 29

Basic Data Structures and Algorithms in the Linux kernel

Data Structures and Algorithms in the Chromium Web Browser

Programming Language Libraries

Allocation and Scheduling Algorithms

Core utils in *nix systems

Cryptographic Algorithms

Compilers

Compression and Image Processing

Conflict Driven Clause Learning

Not the answer you're looking for? Browse other questions tagged
ds.algorithms
big-picture
application-of-theory
or ask your own question.

Linked

Hot Network Questions

Core algorithms deployed

Update:

29 Answers 29

Basic Data Structures and Algorithms in the Linux kernel

Data Structures and Algorithms in the Chromium Web Browser

Programming Language Libraries

Allocation and Scheduling Algorithms

Core utils in *nix systems

Cryptographic Algorithms

Compilers

Compression and Image Processing

Conflict Driven Clause Learning

Not the answer you're looking for? Browse other questions tagged ds.algorithmsbig-pictureapplication-of-theory or ask your own question.

Linked

Related

Hot Network Questions

Not the answer you're looking for? Browse other questions tagged
ds.algorithms
big-picture
application-of-theory
or ask your own question.