• Natural Language Processing in the Browser
  • Originally written by Martin Novak
  • The Nuggets translation Project
  • Permanent link to this article: github.com/xitu/gold-m…
  • Translator: regon – cao
  • Proofread by zenblo NieZhuZhu Lsvih

You can build a chatbot for your own website without relying on third-party services and servers such as Dialogflow. I’ll show you how to build a chatbot that runs completely in the browser.

I assume you have some understanding of how JavaScript and natural language processing work. No other advanced knowledge or machine learning experience is required to accomplish this task.

If anyone tells you that using JavaScript for machine learning in your browser is crazy, ignore them, because you’ll soon know how to do it.

Our code is based on NLP.js version 4. NLP.js is an open source library written in JavaScript for natural language processing. It lets you train an NLP model directly from a corpus in the browser and add a hook to change the answers programmatically.

The full project is at my GitHub repository: github.com/MeetMartin/… . You can download and open index.html to play with the chatbot.

Every real developer today should have some experience in AI development. What could be more sci-fi than talking to your computer through something you’ve developed yourself?

Installing the packages

Create a new NPM project in any folder and install the NLP.js packages:

npm i -D @nlpjs/core @nlpjs/lang-en-min @nlpjs/nlp @nlpjs/request-rn

We also need to install Browserify and Terser to build NLP packages for the browser:

npm i -D browserify terser

Enjoy the feeling of a fresh project with its packages just installed. You deserve it.

Build the NLP package

The first step is to bundle the NLP.js packages with Browserify and Terser. To do this, we only need to create a small entry file, buildable.js:

const core = require('@nlpjs/core');
const nlp = require('@nlpjs/nlp');
const langenmin = require('@nlpjs/lang-en-min');
const requestrn = require('@nlpjs/request-rn');

window.nlpjs = { ...core, ...nlp, ...langenmin, ...requestrn };

We only use the NLP.js core code and a small English language package. To build everything, add the following build script to your package.json:

{
  "name": "nlpjs-web",
  "version": "1.0.0",
  "scripts": {
    "build": "browserify ./buildable.js | terser --compress --mangle > ./dist/bundle.js"
  },
  "devDependencies": {
    "@nlpjs/core": "^4.14.0",
    "@nlpjs/lang-en-min": "^4.14.0",
    "@nlpjs/nlp": "^4.15.0",
    "@nlpjs/request-rn": "^4.14.3",
    "browserify": "^17.0.0",
    "terser": "^5.3.8"
  }
}

Make sure the dist folder exists (the shell redirect in the build script will not create it), then run the following command:

npm run build

You should get a ./dist/bundle.js of about 137 KB. It’s worth noting that NLP.js supports an impressive list of languages. However, only English has a browser-optimized version.

Train the NLP model in the browser

Now that we have created the bundle, we can train the NLP model in the browser. Create index.html:

<html>
<head>
    <title>NLP in a browser</title>
    <script src='./dist/bundle.js'></script>
    <script>
        const {containerBootstrap, Nlp, LangEn, fs} = window.nlpjs;

        const setupNLP = async corpus => {
            const container = containerBootstrap();
            container.register('fs', fs);
            container.use(Nlp);
            container.use(LangEn);
            const nlp = container.get('nlp');
            nlp.settings.autoSave = false;
            await nlp.addCorpus(corpus);
            await nlp.train();
            return nlp;
        };

        (async () => {
            const nlp = await setupNLP('https://raw.githubusercontent.com/jesus-seijas-sp/nlpjs-examples/master/01.quickstart/02.filecorpus/corpus-en.json');
        })();
    </script>
</head>
<body>
    <h1>NLP in a browser</h1>
    <div id="chat"></div>
    <form id="chatbotForm">
        <input type="text" id="chatInput" />
        <input type="submit" id="chatSubmit" value="send" />
    </form>
</body>
</html>

The setupNLP function sets up the library and trains the model. A corpus is a JSON file that defines the chatbot’s conversations in the following format:

{
  "name": "Corpus",
  "locale": "en-US",
  "data": [
    {
      "intent": "agent.acquaintance",
      "utterances": [
        "say about you",
        "why are you here",
        "what is your personality",
        "describe yourself",
        "tell me about yourself",
        "tell me about you",
        "what are you",
        "who are you",
        "I want to know more about you",
        "talk about yourself"
      ],
      "answers": [
        "I'm a virtual agent",
        "Think of me as a virtual agent",
        "Well, I'm not a person, I'm a virtual agent",
        "I'm a virtual being, not a real person",
        "I'm a conversational app"
      ]
    },
    {
      "intent": "agent.age",
      "utterances": [
        "your age",
        "how old is your platform",
        "how old are you",
        "what's your age",
        "I'd like to know your age",
        "tell me your age"
      ],
      "answers": [
        "I'm very young",
        "I was created recently",
        "Age is just a number. You're only as old as you feel"
      ]
    }
  ]
}

intent is the unique identifier of a conversation node, and its name should describe what the user is trying to say to the bot. utterances is a list of training examples of what the user might say to trigger the intent. answers is a list of replies from which the chatbot picks one at random.
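
A hand-written corpus is easy to get subtly wrong, so it can help to check its shape before training. The sketch below is a plain JavaScript check based only on the format shown above. The function name findCorpusProblems is hypothetical and not part of NLP.js, and the check may be stricter than what the library itself requires.

```javascript
// Hypothetical helper (not part of NLP.js): returns a list of problems
// found in a corpus object that follows the format shown in the article.
function findCorpusProblems(corpus) {
  const problems = [];
  if (typeof corpus.locale !== 'string') {
    problems.push('missing "locale"');
  }
  if (!Array.isArray(corpus.data)) {
    problems.push('"data" must be an array');
    return problems;
  }
  const seenIntents = new Set();
  corpus.data.forEach((entry, index) => {
    const hasIntent = typeof entry.intent === 'string' && entry.intent !== '';
    const label = hasIntent ? entry.intent : `entry #${index}`;
    if (!hasIntent) {
      problems.push(`${label}: missing "intent"`);
    } else if (seenIntents.has(entry.intent)) {
      // The article describes intent as a unique identifier.
      problems.push(`${label}: duplicate intent`);
    } else {
      seenIntents.add(entry.intent);
    }
    for (const key of ['utterances', 'answers']) {
      if (!Array.isArray(entry[key]) || entry[key].length === 0) {
        problems.push(`${label}: "${key}" must be a non-empty array`);
      }
    }
  });
  return problems;
}

// Sample data with two deliberate mistakes: a repeated intent and an
// empty utterances list.
const sampleCorpus = {
  name: 'Corpus',
  locale: 'en-US',
  data: [
    { intent: 'agent.age', utterances: ['how old are you'], answers: ["I'm very young"] },
    { intent: 'agent.age', utterances: [], answers: ['I was created recently'] },
  ],
};

console.log(findCorpusProblems(sampleCorpus));
```

For the sample above, the check reports the duplicate intent and the empty utterances list. An empty result means the corpus matches the shape described here.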

To train the bot, we borrow a larger corpus from raw.githubusercontent.com/jesus-seija… . Feel free to create your own corpus for your use case. Just keep in mind that the NLP library needs to read the corpus from a URL.

When you open index.html in your browser, you should see a simple chatbot that does nothing.

But if you open the browser console, you can already see the output of a successful training run.

Training is fast enough that the chatbot can train in the browser and use the resulting model right away. Shipping the corpus file is also the more efficient approach, because it is much smaller than the generated model.

It feels great to train your first machine learning code. You just became a legend, one of the few people on the planet who can say, “Yes, I trained an AI once, no big deal.”

Chatbot HTML

Let’s get this chatbot form working. Add the onChatSubmit function to your index.html:

<html>
<head>
    <title>NLP in a browser</title>
    <script src='./dist/bundle.js'></script>
    <script>
        const {containerBootstrap, Nlp, LangEn, fs} = window.nlpjs;

        const setupNLP = async corpus => {
            const container = containerBootstrap();
            container.register('fs', fs);
            container.use(Nlp);
            container.use(LangEn);
            const nlp = container.get('nlp');
            nlp.settings.autoSave = false;
            await nlp.addCorpus(corpus);
            await nlp.train();
            return nlp;
        };

        const onChatSubmit = nlp => async event => {
            event.preventDefault();
            const chat = document.getElementById('chat');
            const chatInput = document.getElementById('chatInput');
            chat.innerHTML = chat.innerHTML + `<p>you: ${chatInput.value}</p>`;
            const response = await nlp.process('en', chatInput.value);
            chat.innerHTML = chat.innerHTML + `<p>chatbot: ${response.answer}</p>`;
            chatInput.value = '';
        };

        (async () => {
            const nlp = await setupNLP('https://raw.githubusercontent.com/jesus-seijas-sp/nlpjs-examples/master/01.quickstart/02.filecorpus/corpus-en.json');
            const chatForm = document.getElementById('chatbotForm');
            chatForm.addEventListener('submit', onChatSubmit(nlp));
        })();
    </script>
</head>
<body>
<h1>NLP in a browser</h1>
<div id="chat"></div>
<form id="chatbotForm">
    <input type="text" id="chatInput" />
    <input type="submit" id="chatSubmit" value="send" />
</form>
</body>
</html>

Now you can interact with the chatbot.

Check your corpus, or the one at raw.githubusercontent.com/jesus-seija… , to learn which conversations are supported.
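
The onChatSubmit handler above writes whatever the user typed straight into innerHTML, so typed markup such as `<b>` is parsed as HTML rather than shown as text. Below is a minimal sketch of an escaping helper, assuming you want the input displayed literally. The name escapeHtml is hypothetical and is not something NLP.js provides.

```javascript
// Characters that have a special meaning in HTML and their entity forms.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;',
};

// Hypothetical helper: returns the text with HTML-special characters
// replaced, so it can be placed inside innerHTML and still read as text.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (char) => HTML_ESCAPES[char]);
}

console.log(escapeHtml('<b>hi</b> & bye'));
// prints: &lt;b&gt;hi&lt;/b&gt; &amp; bye
```

In the handler you would wrap the interpolated value, for example escapeHtml(chatInput.value). Building the paragraph with document.createElement and textContent is another option that avoids innerHTML altogether.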

This is something you can show your friends to make them envious. Now you’re a real hacker.

Add hooks to intents

You might want your chatbot to run some code for each intent, or to replace some intent answers with the result of an API call. Let’s extend index.html to the final version.

<html>
<head>
    <title>NLP in a browser</title>
    <script src='./dist/bundle.js'></script>
    <script>
        const {containerBootstrap, Nlp, LangEn, fs} = window.nlpjs;

        function onIntent(nlp, input) {
            console.log(input);
            if (input.intent === 'greetings.hello') {
                const hours = new Date().getHours();
                const output = input;
                if (hours < 12) {
                    output.answer = 'Good morning!';
                } else if (hours < 17) {
                    output.answer = 'Good afternoon!';
                } else {
                    output.answer = 'Good evening!';
                }
                return output;
            }
            return input;
        }

        const setupNLP = async corpus => {
            const container = containerBootstrap();
            container.register('fs', fs);
            container.use(Nlp);
            container.use(LangEn);
            const nlp = container.get('nlp');
            nlp.onIntent = onIntent;
            nlp.settings.autoSave = false;
            await nlp.addCorpus(corpus);
            await nlp.train();
            return nlp;
        };

        const onChatSubmit = nlp => async event => {
            event.preventDefault();
            const chat = document.getElementById('chat');
            const chatInput = document.getElementById('chatInput');
            chat.innerHTML = chat.innerHTML + `<p>you: ${chatInput.value}</p>`;
            const response = await nlp.process('en', chatInput.value);
            chat.innerHTML = chat.innerHTML + `<p>chatbot: ${response.answer}</p>`;
            chatInput.value = '';
        };

        (async () => {
            const nlp = await setupNLP('https://raw.githubusercontent.com/jesus-seijas-sp/nlpjs-examples/master/01.quickstart/02.filecorpus/corpus-en.json');
            const chatForm = document.getElementById('chatbotForm');
            chatForm.addEventListener('submit', onChatSubmit(nlp));
        })();
    </script>
</head>
<body>
<h1>NLP in a browser</h1>
<div id="chat"></div>
<form id="chatbotForm">
    <input type="text" id="chatInput" />
    <input type="submit" id="chatSubmit" value="send" />
</form>
</body>
</html>

We added a line to the setupNLP function:

nlp.onIntent = onIntent;

We also created the onIntent function. Note that onIntent logs the result object for every conversation turn to the console. For the greetings.hello intent, it replaces the answer with a greeting based on the user’s current time of day. In my case it is afternoon.
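
As more intents get custom behaviour, the if block inside onIntent tends to grow. One way to keep it manageable is a lookup table keyed by intent name. The sketch below is a variation on the hook above, not the article's implementation: intentHandlers, greetingForHour, and the now parameter are hypothetical names introduced for illustration, and only the (nlp, input) signature and the intent and answer fields come from the code in this article.

```javascript
// Pure helper so the time-based logic can be checked without a real clock.
function greetingForHour(hours) {
  if (hours < 12) return 'Good morning!';
  if (hours < 17) return 'Good afternoon!';
  return 'Good evening!';
}

// Hypothetical lookup table: intent name -> function that edits the result.
// A Map is used so that only explicitly registered intents are matched.
const intentHandlers = new Map([
  ['greetings.hello', (result, now) => {
    result.answer = greetingForHour(now.getHours());
  }],
]);

// Same (nlp, input) signature as the onIntent hook in the article.
// The third parameter is an addition for testing; it defaults to the
// current time when the library calls the hook with two arguments.
function onIntent(nlp, input, now = new Date()) {
  const handler = intentHandlers.get(input.intent);
  if (handler) {
    handler(input, now);
  }
  return input;
}

// Usage with a fixed local time of 09:30.
const morningResult = onIntent(
  null,
  { intent: 'greetings.hello', answer: 'Hello!' },
  new Date(2024, 0, 1, 9, 30)
);
console.log(morningResult.answer); // Good morning!
```

Intents without a registered handler pass through unchanged, so the answers from the corpus still apply to them.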

Isn’t it great? If you’re ready to start your own AI startup, let’s give it a high-five!

Known limitations

Note that the browser version of the NLP.js library does not support some common natural language processing features that are available in the full library, such as named entities or entity extraction.

NLP.js as a library also does not currently support scenarios or dialogue flow control. These are part of chatbot flow functionality that is currently under development, but at the time of writing the feature is still experimental.

Security and privacy considerations

When using this approach, keep in mind that the entire corpus and its functionality are available in the browser to anyone who visits your site. Anyone can easily download, modify, and reuse your corpus. Make sure your bot doesn’t reveal any private information.

A browser-only solution has some advantages, but it also misses some opportunities: you still need a back-end solution if you want to record what users say to the chatbot. And if you record entire conversations, consider the privacy implications, especially in the context of laws like the GDPR.
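
If you do add a back end for logging, one privacy-conscious option is to send only what you need rather than the whole conversation. The sketch below assumes that the object returned by nlp.process exposes intent and score fields alongside the answer field used earlier, which you should verify against your installed version. The send callback and the /chat-log endpoint mentioned in the comments are hypothetical placeholders.

```javascript
// Builds a minimal log record from a chatbot response.
// Assumption: the response has `intent` and `score` fields. The raw user
// text is deliberately left out to limit the personal data that is stored.
function buildLogEntry(response, timestamp = Date.now()) {
  const hasScore = typeof response.score === 'number';
  return {
    intent: response.intent,
    score: hasScore ? Number(response.score.toFixed(2)) : null,
    answered: Boolean(response.answer),
    timestamp,
  };
}

// `send` is injected so the sketch does not depend on one transport.
// In a browser it could wrap fetch('/chat-log', ...), where '/chat-log'
// is a hypothetical endpoint on your own server.
async function logResponse(response, send) {
  const entry = buildLogEntry(response);
  await send(JSON.stringify(entry));
  return entry;
}

// Usage with a stand-in sender that only prints the payload.
logResponse(
  { intent: 'agent.age', score: 0.9731, answer: "I'm very young" },
  async (body) => { console.log(body); }
);
```

Keeping the record-building step separate from the transport makes it easy to review exactly which fields leave the browser.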
