“Offer comes, ask friends to take it! I am participating in the 2022 Spring Recruit Punch card campaign. Click here for more details.”


📢 preface

🚀 Algorithm 🚀
  • 🌲 punch in an algorithm every day, which is not only a learning process, but also a sharing process 😜
  • 🌲 tip: the programming languages used in this column are C# and Java
  • 🌲 to maintain a state of learning every day, let us work together to become a god of algorithms 🧐!
  • 🌲 today is the force button algorithm continued to punch the card 97 days 🎈!
🚀 Algorithm 🚀

🌲 The most common words

Given a paragraph (paragraph) and a list of banned words (banned). Returns the word that occurs most often and is not on the banned list.

The questions ensure that at least one word is not on the banned list and that the answer is unique.

Words on the banned list are written in lower case, without punctuation. Words in paragraphs are case insensitive. The answers are all lowercase letters.

Example:

Input: paragraph ="Bob hit a ball, the hit BALL flew far after it was hit."
banned = ["hit"] output:"ball"Explanation:"hit"the3But it is a banned word."ball"the2No other words appear at the same time2"), so it's the most frequently used word in the paragraph and not on the banned list. Note that all of these words are case insensitive in paragraphs, and punctuation should be ignored (even if it's next to the word, e.g"ball,"),"hit"Not the final answer, although it appears more often, but it's on the list of banned words.Copy the code

Tip:

  • 1 <= Paragraph length <= 1000
  • 0 <= Number of prohibited words <= 100
  • 1 <= Disable word length <= 10
  • The answers are unique and all in lower case (even if they are uppercase in paragraph, even for certain nouns, the answers are all in lower case).
  • Paragraph contains only letters, Spaces, and the following punctuation marks! ? ‘,; .
  • There are no unhyphenated or hyphenated words.
  • Words contain only letters, no ellipses or other punctuation marks.

🌻C# method: dictionary

Use a dictionary to process the data, and finally determine whether it belongs to the banned words!

Code:

public class Solution {
    public string MostCommonWord(string paragraph, string[] banned) {

            paragraph =paragraph.ToLower();
            string str = "";
            Dictionary<string.int> dic = new Dictionary<string.int> ();for (int i = 0; i < paragraph.Length; i++)
            {
                if (paragraph[i]-'a'<0|| paragraph[i] -'a' >26)
                {
                    if (string.IsNullOrEmpty(str))
                    {
                        continue;
                    }
                    else
                    {
                        if (dic.ContainsKey(str))
                        {
                            dic[str]++;
                        }
                        else
                        {
                            dic.Add(str,1);
                        }
                        str = ""; }}else{ str += paragraph[i].ToString(); }}if(str! =""&&dic.ContainsKey(str))
             {
                dic[str]++;
             }
                else
             {
                dic.Add(str,1);
             }
               str = "";            
            int num = int.MinValue;
            string res = "";
            foreach (var item in dic)
            {
                if (banned.Contains(item.Key))
                {
                    continue;
                }
                else
                {
                    if(item.Value>num) { num = item.Value; res = item.Key; }}}returnres; }}Copy the code

The execution result

By execution time:112Ms, in all CDefeated 58.00% of users in # submitMemory consumption:39.9MB, in all CBeat 58.33% of users in # submit
Copy the code

🌻Java method: Simple counting

We count the number of occurrences of each word, ignoring all punctuation and case, and the answer is the word that appears most often and is not on the banned list.

There are two ways to count words. In the first method, we first split the entire paragraph by space, then for each word we split, we remove punctuation and ignore case. In the second method, we scan the entire paragraph character by character, and if we encounter a non-letter symbol, we treat the previously encountered letter as a word.

For each word, we count it in a HashMap (HashMap in Java or Counter in Python). After each word is added, we can update the answer once if the word is not on the banned list.

Code:

class Solution {
    public String mostCommonWord(String paragraph, String[] banned) {
        paragraph += ".";

        Set<String> banset = new HashSet();
        for (String word: banned) banset.add(word);
        Map<String, Integer> count = new HashMap();

        String ans = "";
        int ansfreq = 0;

        StringBuilder word = new StringBuilder();
        for (char c: paragraph.toCharArray()) {
            if (Character.isLetter(c)) {
                word.append(Character.toLowerCase(c));
            } else if (word.length() > 0) {
                String finalword = word.toString();
                if(! banset.contains(finalword)) { count.put(finalword, count.getOrDefault(finalword,0) + 1);
                    if (count.get(finalword) > ansfreq) {
                        ans = finalword;
                        ansfreq = count.get(finalword);
                    }
                }
                word = newStringBuilder(); }}returnans; }}Copy the code

The execution result

By execution time:5Ms, beat out all Java commits98.76% user memory consumption:38.2MB, beat out all Java commits88.29% of the userCopy the code

Complexity analysis

Time complexity: O(P+B) Space complexity: O(P+B)Copy the code

💬 summary

  • Today is the ninety-seventh day of clocking!
  • The article USES theC# andJavaTwo programming languages to solve the problem
  • Some methods are also written by the god of reference force buckle, and they are also shared while learning, thanks again to the algorithm masters
  • That’s the end of today’s algorithm sharing, see you tomorrow!