HomeLeetcode187. Repeated DNA Sequences - Leetcode Solutions

187. Repeated DNA Sequences – Leetcode Solutions

Description

The DNA sequence is composed of a series of nucleotides abbreviated as 'A''C''G', and 'T'.

  • For example, "ACGAATTCCG" is a DNA sequence.

When studying DNA, it is useful to identify repeated sequences within the DNA.

Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any order.

Examples:

Example 1:

Input: s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"
Output: ["AAAAACCCCC","CCCCCAAAAA"]
Example 2:

Input: s = "AAAAAAAAAAAAA"
Output: ["AAAAAAAAAA"]

Solution in Python

Python
from typing import List
from collections import defaultdict

class Solution:
    def findRepeatedDnaSequences(self, s: str) -> List[str]:
        # Initialize a dictionary to count occurrences of each 10-letter sequence
        sequence_count = defaultdict(int)
        # List to store the sequences that are repeated
        repeated_sequences = []

        # Iterate through the string, extracting each 10-letter sequence
        for i in range(len(s) - 9):
            # Extract the current 10-letter sequence
            sequence = s[i:i+10]
            # Increase the count of this sequence in the dictionary
            sequence_count[sequence] += 1

            # If the count becomes 2, it means this sequence has been seen more than once
            # Add it to the result list (only add once)
            if sequence_count[sequence] == 2:
                repeated_sequences.append(sequence)

        # Return the list of repeated sequences
        return repeated_sequences

Explanation:

  1. Imports and Initialization:
    • from typing import List: Import the List type hint from the typing module.
    • from collections import defaultdict: Import defaultdict from the collections module to simplify counting occurrences.
  2. Class and Method Definition:
    • class Solution:: Define a class named Solution.
    • def findRepeatedDnaSequences(self, s: str) -> List[str]:: Define a method findRepeatedDnaSequences that takes a string s and returns a list of strings.
  3. Counting Sequences:
    • sequence_count = defaultdict(int): Initialize a defaultdict to store and count each 10-letter sequence.
    • repeated_sequences = []: Initialize an empty list to store sequences that occur more than once.
  4. Iterating Through the String:
    • for i in range(len(s) - 9):: Iterate over the string s, stopping at len(s) - 9 to ensure we can extract a 10-letter sequence.
    • sequence = s[i:i+10]: Extract the current 10-letter sequence starting at index i.
  5. Updating the Count and Checking for Repeats:
    • sequence_count[sequence] += 1: Increase the count of this sequence in the dictionary.
    • if sequence_count[sequence] == 2:: If the count becomes 2, it means this sequence has been seen more than once, so we add it to the result list.
  6. Return the Result:
    • return repeated_sequences: Return the list of sequences that occurred more than once.

Time Complexity:

  • The time complexity is O(n), where n is the length of the string s. This is because we iterate over the string once and perform constant-time operations for each 10-letter sequence.

Space Complexity:

  • The space complexity is O(n) in the worst case, where every possible 10-letter sequence is unique and stored in the dictionary.

This code efficiently finds all 10-letter-long sequences that occur more than once in the given DNA sequence string s.

Solution in Javascript

JavaScript
/**
 * @param {string} s
 * @return {string[]}
 */
var findRepeatedDnaSequences = function(s) {
    // Dictionary to keep track of occurrences of each 10-letter sequence
    const sequenceCount = new Map();
    // Array to store sequences that are repeated
    const repeatedSequences = [];

    // Loop through the string and extract each 10-letter sequence
    for (let i = 0; i <= s.length - 10; i++) {
        // Extract the current 10-letter sequence
        const sequence = s.substring(i, i + 10);
        
        // If the sequence already exists in the map, increment its count
        if (sequenceCount.has(sequence)) {
            sequenceCount.set(sequence, sequenceCount.get(sequence) + 1);
        } else {
            // If the sequence is new, add it to the map with a count of 1
            sequenceCount.set(sequence, 1);
        }

        // If the count becomes 2, it means this sequence has been seen more than once
        // We only add it to the result array the first time this condition is true
        if (sequenceCount.get(sequence) === 2) {
            repeatedSequences.push(sequence);
        }
    }

    // Return the array of repeated sequences
    return repeatedSequences;
};

Solution in Java

Java
class Solution {
    public List<String> findRepeatedDnaSequences(String s) {
        // Map to keep track of occurrences of each 10-letter sequence
        Map<String, Integer> sequenceCount = new HashMap<>();
        // List to store sequences that are repeated
        List<String> repeatedSequences = new ArrayList<>();

        // Iterate through the string and extract each 10-letter sequence
        for (int i = 0; i <= s.length() - 10; i++) {
            // Extract the current 10-letter sequence
            String sequence = s.substring(i, i + 10);

            // If the sequence already exists in the map, increment its count
            sequenceCount.put(sequence, sequenceCount.getOrDefault(sequence, 0) + 1);

            // If the count becomes 2, it means this sequence has been seen more than once
            // Add it to the result list (only add once when count is exactly 2)
            if (sequenceCount.get(sequence) == 2) {
                repeatedSequences.add(sequence);
            }
        }

        // Return the list of repeated sequences
        return repeatedSequences;
    }
}

Solution in C#

C#
using System;
using System.Collections.Generic;

public class Solution {
    public IList<string> FindRepeatedDnaSequences(string s) {
        // Dictionary to keep track of occurrences of each 10-letter sequence
        Dictionary<string, int> sequenceCount = new Dictionary<string, int>();
        // List to store sequences that are repeated
        IList<string> repeatedSequences = new List<string>();

        // Iterate through the string and extract each 10-letter sequence
        for (int i = 0; i <= s.Length - 10; i++) {
            // Extract the current 10-letter sequence
            string sequence = s.Substring(i, 10);

            // If the sequence already exists in the dictionary, increment its count
            if (sequenceCount.ContainsKey(sequence)) {
                sequenceCount[sequence]++;
            } else {
                // If the sequence is new, add it to the dictionary with a count of 1
                sequenceCount[sequence] = 1;
            }

            // If the count becomes 2, it means this sequence has been seen more than once
            // Add it to the result list (only add once)
            if (sequenceCount[sequence] == 2) {
                repeatedSequences.Add(sequence);
            }
        }

        // Return the list of repeated sequences
        return repeatedSequences;
    }
}

Subscribe
Notify of

0 Comments
Oldest
Newest Most Voted
Inline Feedbacks
View all comments

Popular