Algorithms on Strings (Coursera)

Algorithms on Strings (Coursera)

World and internet is full of textual information. We search for information using textual queries, we read websites, books, e-mails. All those are strings from the point of view of computer science. To make sense of all that information and make search efficient, search engines use many string algorithms. Moreover, the emerging field of personalized medicine uses many search algorithms to find disease-causing mutations in the human genome.

Class Deals by MOOC List - Click here and see Coursera's Active Discounts, Deals, and Promo Codes.

Course 4 of 6 in the Data Structures and Algorithms Specialization.

Syllabus

WEEK 1
Suffix Trees
How would you search for a longest repeat in a string in LINEAR time? In 1973, Peter Weiner came up with a surprising solution that was based on suffix trees, the key data structure in pattern matching. Computer scientists were so impressed with his algorithm that they called it the Algorithm of the Year. In this lesson, we will explore some key ideas for pattern matching that will - through a series of trials and errors - bring us to suffix trees.

WEEK 2
Burrows-Wheeler Transform and Suffix Arrays
Although EXACT pattern matching with suffix trees is fast, it is not clear how to use suffix trees for APPROXIMATE pattern matching. In 1994, Michael Burrows and David Wheeler invented an ingenious algorithm for text compression that is now known as Burrows-Wheeler Transform. They knew nothing about genomics, and they could not have imagined that 15 years later their algorithm will become the workhorse of biologists searching for genomic mutations. But what text compression has to do with pattern matching??? In this lesson you will learn that the fate of an algorithm is often hard to predict – its applications may appear in a field that has nothing to do with the original plan of its inventors.

WEEK 3
Knuth–Morris–Pratt Algorithm
Congratulations, you have now learned the key pattern matching concepts: tries, suffix trees, suffix arrays and even the Burrows-Wheeler transform! However, some of the results Pavel mentioned remain mysterious: e.g., how can we perform exact pattern matching in O(|Text|) time rather than in O(|Text|*|Pattern|) time as in the naïve brute force algorithm? How can it be that matching a 1000-nucleotide pattern against the human genome is nearly as fast as matching a 3-nucleotide pattern??? Also, even though Pavel showed how to quickly construct the suffix array given the suffix tree, he has not revealed the magic behind the fast algorithms for the suffix tree construction!In this module, Miсhael will address some algorithmic challenges that Pavel tried to hide from you :) such as the Knuth-Morris-Pratt algorithm for exact pattern matching and more efficient algorithms for suffix tree and suffix array construction.

WEEK 4
Constructing Suffix Arrays and Suffix Trees
In this module we continue studying algorithmic challenges of the string algorithms. You will learn an O(n log n) algorithm for suffix array construction and a linear time algorithm for construction of suffix tree from a suffix array. You will also implement these algorithms and the Knuth-Morris-Pratt algorithm in the last Programming Assignment in this course.

Go to Class
MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Related Courses

Practical Machine Learning (Coursera) Coursera
Johns Hopkins University

Practical Machine Learning (Coursera)

One of the most common tasks performed by data scientists and data analysts are prediction and machine learning. This course will cover the basic components of building and applying prediction functions with an emphasis on practical applications. The course will provide basic grounding in concepts such as training and tests sets, overfitting, and error rates.

Jun 22nd 2026
4 Weeks
Finding Hidden Messages in DNA (Bioinformatics I) (Coursera) Coursera
University of California, San Diego

Finding Hidden Messages in DNA (Bioinformatics I) (Coursera)

This course begins a series of classes illustrating the power of computing in modern biology. Please join us on the frontier of bioinformatics to look for hidden messages in DNA without ever needing to put on a lab coat. In the first half of the course, we investigate DNA replication, and ask the question, where in the genome does DNA replication begin? We will see that we can answer this question for many bacteria using only some straightforward algorithms to look for hidden messages in the genome.

Jun 22nd 2026
5-12 Weeks
Big Data Analysis with Scala and Spark (Coursera) Coursera
École Polytechnique Fédérale de Lausanne

Big Data Analysis with Scala and Spark (Coursera)

Manipulating big data distributed over a cluster using functional concepts is rampant in industry, and is arguably one of the first widespread industrial uses of functional ideas. This is evidenced by the popularity of MapReduce and Hadoop, and most recently Apache Spark, a fast, in-memory distributed collections framework written in Scala. In this course, we'll see how the data parallel paradigm can be extended to the distributed case, using Spark throughout.

Jun 22nd 2026
4 Weeks
Machine Learning: Regression (Coursera) Coursera
University of Washington

Machine Learning: Regression (Coursera)

Case Study - Predicting Housing Prices. In our first case study, predicting house prices, you will create models that predict a continuous value (price) from input features (square footage, number of bedrooms and bathrooms,...). This is just one of the many places where regression can be applied. Other applications range from predicting health outcomes in medicine, stock prices in finance, and power usage in high-performance computing, to analyzing which regulators are important for gene expression.

Jun 22nd 2026
5-12 Weeks
Crash Course on Python (Coursera) Coursera
Google

Crash Course on Python (Coursera)

This course is designed to teach you the foundations in order to write simple programs in Python using the most common structures. No previous exposure to programming is needed. By the end of this course, you'll understand the benefits of programming in IT roles; be able to write simple programs using Python; figure out how the building blocks of programming fit together; and combine all of this knowledge to solve a complex programming problem.

Jun 23rd 2026
5-12 Weeks
Data Structures and Design Patterns for Game Developers (Coursera) Coursera
University of Colorado System

Data Structures and Design Patterns for Game Developers (Coursera)

This course is the fourth course in the specialization about learning how to develop video games using the C# programming language and the Unity game engine on Windows or Mac. Why use C# and Unity instead of some other language and game engine? Well, C# is a really good language for learning how to program and then programming professionally. Also, the Unity game engine is very popular with indie game developers; Unity games were downloaded 16,000,000,000 times in 2016! Finally, C# is one of the programming languages you can use in the Unity environment.

Jun 22nd 2026
4 Weeks
Introduction to Google SEO (Coursera) Coursera
University of California, Davis

Introduction to Google SEO (Coursera)

Ever wonder how major search engines such as Google, Bing and Yahoo rank your website within their searches? Or how content such as videos or local listings are shown and ranked based on what the search engine considers most relevant to users? Welcome to the world of Search Engine Optimization (SEO). This course is the first within the SEO Specialization and it is intended to give you a taste of SEO with some fun practices to get seen in Google.

Jun 22nd 2026
4 Weeks
Unordered Data Structures (Coursera) Coursera
University of Illinois at Urbana-Champaign

Unordered Data Structures (Coursera)

The Unordered Data Structures course covers the data structures and algorithms needed to implement hash tables, disjoint sets and graphs. These fundamental data structures are useful for unordered data. For example, a hash table provides immediate access to data indexed by an arbitrary key value, that could be a number (such as a memory address for cached memory), a URL (such as for a web cache) or a dictionary.

Jun 24th 2026
4 Weeks
Digital Signal Processing 1: Basic Concepts and Algorithms (Coursera) Coursera
École Polytechnique Fédérale de Lausanne

Digital Signal Processing 1: Basic Concepts and Algorithms (Coursera)

Digital Signal Processing is the branch of engineering that, in the space of just a few decades, has enabled unprecedented levels of interpersonal communication and of on-demand entertainment. By reworking the principles of electronics, telecommunication and computer science into a unifying paradigm, DSP is a the heart of the digital revolution that brought us CDs, DVDs, MP3 players, mobile phones and countless other devices.

Jun 22nd 2026
4 Weeks