EdX

String Processing and Pattern Matching Algorithms (edX)

String Processing and Pattern Matching Algorithms (edX)

Learn about pattern matching and string processing algorithms and how they apply to interesting applications. The world and internet are full of textual information. We search for information using textual queries and read websites, books and e-mails.

Class Deals by MOOC List - Click here and see EdX's Active Discounts, Deals, and Promo Codes.

These are all strings from a computer science point of view. To make sense of all this information and make search efficient, search engines use many string algorithms. Moreover, the emerging field of personalized medicine uses many search algorithms to find disease-causing mutations in the human genome.
In this course, part of the Algorithms and Data Structures MicroMasters program, you will learn about:

  • suffix trees;
  • suffix arrays;
  • how other brilliant algorithmic ideas help doctors to find differences between genomes;
  • power lightning-fast Internet searches.

What you'll learn

  • Key ideas for pattern matching and suffix trees
  • Suffix arrays

-Burrows-Wheeler Transform for compression

  • Applications of string algorithms in bioinformatics

Prerequisites
Basic knowledge of at least one programming language. Successful completion of the Algorithmic Toolbox and Data Structures courses.

Course Syllabus

Weeks 1 and 2: Suffix Trees
How would you search for a longest repeat in a string in LINEAR time? In 1973, Peter Weiner came up with a surprising solution that was based on suffix trees, the key data structure in pattern matching. Computer scientists were so impressed with his algorithm that they called it the Algorithm of the Year. In this lesson, we will explore some key ideas for pattern matching that will - through a series of trials and errors - bring us to suffix trees.

Week 3 and 4: Burrows-Wheeler Transform and Suffix Arrays
Although EXACT pattern matching with suffix trees is fast, it is not clear how to use suffix trees for APPROXIMATE pattern matching. In 1994, Michael Burrows and David Wheeler invented an ingenious algorithm for text compression that is now known as Burrows-Wheeler Transform. They knew nothing about genomics, and they could not have imagined that 15 years later their algorithm will become the workhorse of biologists searching for genomic mutations. But what text compression has to do with pattern matching??? In this lesson you will learn that the fate of an algorithm is often hard to predict – its applications may appear in a field that has nothing to do with the original plan of its inventors.

Go to Class
MOOC List is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Related Courses

AP Computer Science A: Java Programming Polymorphism and Advanced Data Structures (edX) EdX
Purdue University,PurdueX

AP Computer Science A: Java Programming Polymorphism and Advanced Data Structures (edX)

AP Computer Science A from Purdue University. This computer science course covers advanced OOP strategies, including polymorphism, abstract classes, super keyword, exceptions, generics, sorting and searching algorithms. This course is for anyone interested in taking a first-level computer-programming course, particularly those who attend a school that does not provide a similar class.

This course is archived
5-12 Weeks
Advanced Algorithmics and Graph Theory with Python (edX) EdX
Institut Mines-Telecom,IMTx

Advanced Algorithmics and Graph Theory with Python (edX)

Strengthen your skills in algorithmics and graph theory, and gain experience in programming in Python along the way. Algorithmics and programming are fundamental skills for engineering students, data scientists and analysts, computer hobbyists or developers. Learning how to program algorithms can be tedious if you aren’t given an opportunity to immediately practice what you learn. In this course, you won't just focus on theory or study a simple catalog of methods, procedures, and concepts. Instead, you’ll be given a challenge wherein you'll be asked to beat an algorithm we’ve written for you by coming up with your own clever solution.

Sep 4th 2023
5-12 Weeks
Data Structures and Algorithm Design Part I | 数据结构与算法设计(上) (edX) EdX
Tsinghua University,TsinghuaX

Data Structures and Algorithm Design Part I | 数据结构与算法设计(上) (edX)

Learn the basics of data structures and methods to design algorithms and analyze their performance. 本课程旨在围绕各类数据结构的设计与实现,揭示其中的规律原理与方法技巧;同时针对算法设计及其性能分析,使学生了解并掌握主要的套路与手段。

Self Paced
Self-Paced
Data Structures & Algorithms I: ArrayLists, LinkedLists, Stacks and Queues (edX) EdX
Georgia Institute of Technology,GTx

Data Structures & Algorithms I: ArrayLists, LinkedLists, Stacks and Queues (edX)

Work with the principles of data storage in Arrays, ArrayLists & LinkedList nodes. Understand their operations and performance with visualizations. Implement low-level linear, linked data structures with recursive methods, and explore their edge cases. Extend these structures to the Abstract Data Types, Stacks, Queues and Deques.

Self Paced
Self-Paced
Sparse Representations in Signal and Image Processing: Fundamentals (edX) EdX
IsraelX

Sparse Representations in Signal and Image Processing: Fundamentals (edX)

Learn about the field of sparse representations by understanding its fundamental theoretical and algorithmic foundations. This course introduces the fundamentals of the field of sparse representations, starting with its theoretical concepts, and systematically presenting its key achievements. We will touch on theory and numerical algorithms.

Self Paced
Self-Paced
Dynamic Programming: Applications In Machine Learning and Genomics (edX) EdX
University of California, San Diego,UC San DiegoX

Dynamic Programming: Applications In Machine Learning and Genomics (edX)

Learn how dynamic programming and Hidden Markov Models can be used to compare genetic strings and uncover evolution. If you look at two genes that serve the same purpose in two different species, how can you rigorously compare these genes in order to see how they have evolved away from each other?

Self Paced
Self-Paced
Computing in Python IV: Objects & Algorithms (edX) EdX
Georgia Institute of Technology,GTx

Computing in Python IV: Objects & Algorithms (edX)

Learn about recursion, search and sort algorithms, and object-oriented programming in Python. Complete your introductory knowledge of computer science with this final course on objects and algorithms. Now that you've learned about complex control structures and data structures, learn to develop programs that more intuitively leverage your natural understanding of problems through object-oriented programming. Then, learn to analyze the complexity and efficiency of these programs through algorithms. In addition, certify your broader knowledge of Introduction to Computing with a comprehensive exam.

Self Paced
Self-Paced
Autonomous Mobile Robots (edX) EdX
ETH Zurich,ETHx

Autonomous Mobile Robots (edX)

Basic concepts and algorithms for locomotion, perception, and intelligent navigation. Robots are rapidly evolving from factory workhorses, which are physically bound to their work-cells, to increasingly complex machines capable of performing challenging tasks in our daily environment. The objective of this course is to provide the basic concepts and algorithms required to develop mobile robots that act autonomously in complex environments.

Self Paced
Self-Paced