Technical Notes Of
Ehi Kioya


Online Word Counter [With Full Source Code]

Last updated on January 3rd, 2020 at 06:41 am by Ehi Kioya 2 Comments

This page contains everything you need to perform (and understand) word and character counting: a fully functional online word counter, a detailed explanation of the code that powers it, and a full dump of the final HTML, CSS, and JavaScript code that the tool uses.

First, here’s the functional word counter tool

Characters: 0
Words: 0
Sentences: 0
Paragraphs: 0
Reading Time: 0
Top keywords:

    The above word counter also features a keyword counter that displays the top 4 keywords in your text. This keyword counter becomes visible only after you have entered some text.

    How to build a word counter like this one

    In this section, I explain how you can build your own word counter that looks and works like mine above. If you just want to see the full code used for this tool, skip to the full code dump near the end of this page.

    Note: All the counting done in this word counter relies heavily on regular expressions (or regex). So, you will need some understanding of regex if you’re planning to change the core behavior of this tool. But if you’re just reading this casually, then no solid knowledge of regular expressions is necessary.

    Getting user input

    We need something to capture user input. To do this, we will be using the HTML textarea element. Like this:

    <textarea placeholder="Enter your text here..."></textarea>

    The JavaScript to select the above textarea will look like this:

    var input = document.querySelectorAll('textarea')[0];

    This assumes that our word counter textarea is the first (zero index) textarea on the webpage. If for some reason you have another textarea on your page that appears before the one you’re using for the word counter, adjust the array index accordingly.

    We will use input.value to grab the value of the text the user enters into the textarea.

    Since we want to automatically calculate word and character counts (and other results) as the user types, we need to execute some code when the keyup event is triggered. For this reason, the bulk of our JavaScript code is contained in a function that looks like this:

    input.addEventListener('keyup', function() {
    
      // All the logic used for the word counter, sentence counter, character counter, reading time calculator, and keyword finder
    
    });
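    As a side note, keyup only fires for keyboard activity. If you also want to react to changes that do not involve the keyboard (pasting with the mouse, for example), the standard input event is one option. Here is a minimal sketch of that idea. The onTextChange helper is a hypothetical name of mine and is not part of the tool's code:

```javascript
// Hypothetical helper: calls `callback` with the current text on every change.
// `textarea` can be any object exposing `value` and `addEventListener`.
function onTextChange(textarea, callback) {
  textarea.addEventListener('input', function () {
    callback(textarea.value);
  });
}

// Only wire things up when running in a browser page that has a textarea.
if (typeof document !== 'undefined') {
  var box = document.querySelector('textarea');
  if (box) {
    onTextChange(box, function (text) {
      console.log('Characters: ' + text.length);
    });
  }
}
```

    The rest of this article sticks with keyup, which is what the tool on this page uses.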

    The outputs will be stored in simple HTML div elements that look like this:

    <div class="output row">
    	<div>Characters: <span id="characterCount">0</span>
    	</div>
    	<div>Words: <span id="wordCount">0</span>
    	</div>
    </div>

    Counting words, characters, and sentences

    Let’s now explore the regular expressions used for counting words and sentences.

    To find words in an input string, we need two things:

    • Word boundaries: \b
    • Valid word characters: \w

    To increase accuracy, we also look for words with hyphens (-). We want hyphenated words like “front-end” to be counted as one word instead of two or more.

    Here’s our JavaScript regex code for word counting:

    var words = input.value.match(/\b[-?(\w+)?]+\b/gi);
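    • \b matches word boundaries – the start or end of a word
    • [...] is a character class – it matches any single character listed between the square brackets
    • \w inside the class matches any word character (letters, digits, and the underscore)
    • - at the start of the class matches a literal hyphen, so that words like “front-end” are counted as one word instead of two
    • ?, (, ), and + inside the class are literal characters rather than quantifiers or a group, so the class also accepts them; [\w-] would be the stricter choice
    • + after the class matches one or more characters from the class
    • i makes our regex pattern case insensitive (harmless here, since \w already covers both cases)
    • g instructs our pattern to do a global search instead of stopping at the first match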
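    To see the pattern in action outside the page, here is a small sketch that wraps it in a function. The countWords name is my own for this illustration. The regex is the one shown above:

```javascript
// Hypothetical helper wrapping the word-matching regex from this article.
function countWords(text) {
  // match() returns null when nothing matches, so fall back to an empty array.
  var words = text.match(/\b[-?(\w+)?]+\b/gi) || [];
  return words.length;
}

console.log(countWords("A front-end developer writes code")); // 5
console.log(countWords(""));                                  // 0
```

    The `|| []` fallback does the same job as the `if (words)` checks in the full code: it avoids reading `.length` from null when the textarea is empty.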

    Sentences are a little easier to handle because we can just look for sentence separators (or delimiters) and split whenever we find them. JavaScript code for doing this could look like:

    var sentences = input.value.split(/[.|!|?]/g);

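    In the above code, we have a pattern that targets the 3 commonly used sentence separators: period (.), exclamation mark (!), and question mark (?). Note that | has no special meaning inside square brackets, so the pattern also splits on a literal pipe character; /[.!?]/g would target exactly the three separators.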

    After executing the above line of code, the sentences variable will hold an array containing all the sentences.

    But here’s a tricky situation. What if the text we’re counting is something like “talk to you later...”? Because of the three sequential periods, our sentences array will hold 4 strings – one correct sentence, and three empty strings.

    To fix this, we modify our sentence calculation code to this instead:

    var sentences = input.value.split(/[.|!|?]+/g);

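    The + at the end of the pattern now helps us properly deal with consecutive sentence delimiters so that “talk to you later...” yields one sentence followed by a single trailing empty string. The full code further down subtracts 1 from the array length to discount that trailing entry, so the text is correctly counted as one sentence.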
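    If you would rather not depend on how many empty strings split() leaves behind, one alternative is to count only the non-empty pieces. This is a sketch under my own assumptions, not what the tool does: countSentences is a hypothetical name, the pattern drops the | characters, and text without closing punctuation is treated as one sentence:

```javascript
// Hypothetical helper: counts sentences by splitting on runs of ., ! or ?
function countSentences(text) {
  var parts = text.split(/[.!?]+/);
  var count = 0;
  for (var i = 0; i < parts.length; i++) {
    // Ignore pieces that are empty or contain only whitespace.
    if (parts[i].trim() !== "") {
      count++;
    }
  }
  return count;
}

console.log(countSentences("talk to you later..."));   // 1
console.log(countSentences("Hi! How are you? Fine.")); // 3
```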

    Counting keywords

    When you start typing (or paste) text into the above word counter, you will notice that a “top keywords” container automatically appears at the bottom of the tool. This keywords section displays the four top keywords in the entered text and the number of times each keyword occurs.

    Keyword counting has various uses. One common use is to prevent yourself from overusing certain keywords in your writing.

    Let’s discuss how to calculate it…

    Counting keywords: Step 1 – Remove all stop words

    Stop words are the most common words in any language. Before doing keyword analysis on our input text, we need to first filter out all the stop words. But since there is no universally accepted list of stop words in English, you will have to search for one yourself and choose a good stop word list from what you find.

    Instead of doing your own search, you can also use my own list of stop words. It is contained in the JavaScript section of the full code (provided below).

    To filter out stop words, we will be using this code:

    var nonStopWords = [];
    var stopWords = ["a", "able", "about", "above", "abst", "accordance", "according", ...];
    for (var i = 0; i < words.length; i++) {
      if (stopWords.indexOf(words[i].toLowerCase()) === -1 && isNaN(words[i])) {
    	nonStopWords.push(words[i].toLowerCase());
      }
    }

    The stopWords array contains all the words we want to check against. If a word does not exist in the stopWords array, we add it to the nonStopWords array. If a word exists in the stopWords array, we ignore it. We also ignore all the numbers using the isNaN condition.
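    The same filtering step can be tried in isolation with a tiny stop word list. In this sketch, removeStopWords is a hypothetical name and the two-word list is made up for the example. The condition is the one from the loop above:

```javascript
// Hypothetical helper; the stop word list passed in below is illustrative only.
function removeStopWords(words, stopWords) {
  var result = [];
  for (var i = 0; i < words.length; i++) {
    var word = words[i].toLowerCase();
    // Keep the word only if it is not a stop word and not a number.
    if (stopWords.indexOf(word) === -1 && isNaN(word)) {
      result.push(word);
    }
  }
  return result;
}

var sample = ["The", "quick", "fox", "saw", "42", "Foxes"];
console.log(removeStopWords(sample, ["the", "saw"])); // ["quick", "fox", "foxes"]
```

    Lowercasing before the lookup is what lets "The" match the stop word "the".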

    Counting keywords: Step 2 – Make an object containing keywords and their count

    In this step we create an object called keywords. We then loop through the words in our nonStopWords array, checking if the keywords object already contains the current word.

    If the word already exists in the keywords object, we increment its value by one. If not, we create a new key-value pair (the key is the word, and the value is 1).

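    var keywords = Object.create(null); // no inherited keys, so a word like "constructor" is not mistaken for an existing entry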
    for (var i = 0; i < nonStopWords.length; i++) {
    	if (nonStopWords[i] in keywords) {
    		keywords[nonStopWords[i]] += 1;
    	}
    	else {
    		keywords[nonStopWords[i]] = 1;
    	}
    }
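    Here is the counting step as a standalone sketch. countKeywords is a hypothetical name. I am also assuming an object created with Object.create(null), because the in operator looks at inherited properties too, and a plain {} inherits names such as "constructor":

```javascript
// Hypothetical helper: maps each word to the number of times it occurs.
function countKeywords(nonStopWords) {
  // An object without a prototype has no inherited keys to collide with.
  var keywords = Object.create(null);
  for (var i = 0; i < nonStopWords.length; i++) {
    var word = nonStopWords[i];
    if (word in keywords) {
      keywords[word] += 1;
    } else {
      keywords[word] = 1;
    }
  }
  return keywords;
}

var counts = countKeywords(["fox", "dog", "fox", "constructor"]);
console.log(counts.fox);         // 2
console.log(counts.constructor); // 1
```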

    Counting keywords: Step 3 – Sort the keywords object

    In order to use the native sort method in JavaScript on the keywords object, we first convert it to a 2-dimensional array.

    var sortedKeywords = [];
    for (var keyword in keywords) {
    	sortedKeywords.push([keyword, keywords[keyword]])
    }
    sortedKeywords.sort(function(a, b) {
    	return b[1] - a[1]
    });

    We now have a 2D array named sortedKeywords. We use this in the 4th step below.
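    To make the shape of that 2D array concrete, here is a small sketch with made-up counts. sortKeywords is a hypothetical name, and the hasOwnProperty check is an extra guard of mine so that only the object's own keys are copied:

```javascript
// Hypothetical helper: turns { word: count } into [[word, count], ...],
// sorted from the highest count to the lowest.
function sortKeywords(keywords) {
  var pairs = [];
  for (var word in keywords) {
    if (Object.prototype.hasOwnProperty.call(keywords, word)) {
      pairs.push([word, keywords[word]]);
    }
  }
  pairs.sort(function (a, b) {
    return b[1] - a[1];
  });
  return pairs;
}

console.log(sortKeywords({ regex: 2, counter: 5, textarea: 1 }));
// [["counter", 5], ["regex", 2], ["textarea", 1]]
```

    Returning b[1] - a[1] from the comparator is what puts larger counts first; swapping the operands would sort in ascending order instead.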

    Counting keywords: Step 4 – Display the top 4 keywords and their count

    Now, we display the first four elements of the sortedKeywords array (if there are fewer than four items, we display whatever number of items the array has). For each item, the word is at position 0, and the count is at position 1.

    We create a new HTML list item (li) for each entry and append it to our ul with the ID of topKeywords:

    topKeywords.innerHTML = "";
    for (var i = 0; i < sortedKeywords.length && i < 4; i++) {
    	var li = document.createElement('li');
    	li.innerHTML = "<b>" + sortedKeywords[i][0] + "</b>: " + sortedKeywords[i][1];
    	topKeywords.appendChild(li);
    }
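    The loop condition (i < sortedKeywords.length && i < 4) amounts to taking at most the first four entries. As a sketch, the same selection can be written with slice() and checked without a page. formatTopKeywords is a hypothetical name, and it returns plain strings instead of building li elements:

```javascript
// Hypothetical helper: returns up to `limit` "word: count" labels.
function formatTopKeywords(sortedKeywords, limit) {
  // slice() never reads past the end, so short arrays are handled for free.
  return sortedKeywords.slice(0, limit).map(function (pair) {
    return pair[0] + ": " + pair[1];
  });
}

console.log(formatTopKeywords([["counter", 5], ["regex", 2]], 4));
// ["counter: 5", "regex: 2"]
```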

    Here’s a full dump of the code that powers this tool

    The HTML code…

    <div class="ehi-wordcount-container">
    	<textarea placeholder="Enter your text here..."></textarea>
    	<div class="output row">
    		<div>Characters: <span id="characterCount">0</span>
    		</div>
    		<div>Words: <span id="wordCount">0</span>
    		</div>
    	</div>
    	<div class="output row">
    		<div>Sentences: <span id="sentenceCount">0</span>
    		</div>
    		<div>Paragraphs: <span id="paragraphCount">0</span>
    		</div>
    	</div>
    	<div class="output row">
    		<div>Reading Time: <span id="readingTime">0</span>
    		</div>
    	</div>
    	<div class="keywords">Top keywords:<ul id="topKeywords"></ul>
    	</div>
    </div>

    The CSS code…

    .ehi-wordcount-container {
      margin: 2% auto;
      padding: 15px;
      background-color: #FFFFFF;
      -webkit-box-shadow: 0px 1px 4px 0px rgba(0, 0, 0, 0.2);
      box-shadow: 0px 1px 4px 0px rgba(0, 0, 0, 0.2);
    }
    
    .ehi-wordcount-container textarea {
      width: 100%;
      height: 300px;
      padding: 10px;
      border: 1px solid #d9d9d9;
      outline: none;
      font-size: 1em;
      resize: none;
      line-height: 1.5em;
    }
    
    .ehi-wordcount-container textarea:hover {
      border-color: #C0C0C0;
    }
    
    .ehi-wordcount-container textarea:focus {
      border-color: #4D90FE;
    }
    
    .ehi-wordcount-container .output.row {
      width: 100%;
      border: 1px solid #DDD;
      font-size: 1.4em;
      margin: 1% 0;
      background-color: #F9F9F9;
    }
    
    .ehi-wordcount-container .output.row div {
      display: inline-block;
      width: 42%;
      padding: 10px 15px;
      margin: 1%;
    }
    
    .ehi-wordcount-container .output.row span {
      font-weight: bold;
    }
    
    .ehi-wordcount-container .keywords {
      display: none;
      font-size: 1.4em;
      font-weight: 900;
    }
    
    .ehi-wordcount-container .keywords p {
      margin: 0px;
      padding: 0px;
    }
    
    .ehi-wordcount-container .keywords ul {
      font-weight: 400;
      border: 1px solid #DDD;
      font-size: 1em;
      background-color: #F9F9F9;
      margin: 1% 0;
    }
    
    .ehi-wordcount-container .keywords li {
      display: inline-block;
      width: 44%;
      padding: 10px;
      margin: 1%;
    }

    Note that the actual styles that apply to the word counter above are also influenced by overall styles used on this website.

    So, just using my CSS styles above may not give you exactly the same look and feel. But you should get something close enough.

    You can quite easily make additional tweaks to the entire CSS code to make the tool look however you want.

    Since CSS controls styles and appearance only, you can technically get a fully functional word counter even if you ignore the above CSS entirely. It just may not look very good.

    And now the JavaScript code… This is where the real magic happens.

    "use strict";
    var input = document.querySelectorAll('textarea')[0],
    	characterCount = document.querySelector('#characterCount'),
    	wordCount = document.querySelector('#wordCount'),
    	sentenceCount = document.querySelector('#sentenceCount'),
    	paragraphCount = document.querySelector('#paragraphCount'),
    	readingTime = document.querySelector('#readingTime'),
    	keywordsDiv = document.querySelectorAll('.keywords')[0],
    	topKeywords = document.querySelector('#topKeywords');
    
    input.addEventListener('keyup', function() {
    	console.clear();
    	characterCount.innerHTML = input.value.length;
    	var words = input.value.match(/\b[-?(\w+)?]+\b/gi);
    	if (words) {
    		wordCount.innerHTML = words.length;
    	}
    	else {
    		wordCount.innerHTML = 0;
    	}
    
    	if (words) {
    		var sentences = input.value.split(/[.|!|?]+/g);
    		console.log(sentences);
    		sentenceCount.innerHTML = sentences.length - 1;
    	}
    	else {
    		sentenceCount.innerHTML = 0;
    	}
    
    	if (words) {
    		var paragraphs = input.value.replace(/\n$/gm, '').split(/\n/);
    		paragraphCount.innerHTML = paragraphs.length;
    	}
    	else {
    		paragraphCount.innerHTML = 0;
    	}
    
    	if (words) {
    		var seconds = Math.floor(words.length * 60 / 275);
    		if (seconds > 59) {
    			var minutes = Math.floor(seconds / 60);
    			seconds = seconds - minutes * 60;
    			readingTime.innerHTML = minutes + "m " + seconds + "s";
    		}
    		else {
    			readingTime.innerHTML = seconds + "s";
    		}
    	}
    	else {
    		readingTime.innerHTML = "0s";
    	}
    
    	if (words) {
    		var nonStopWords = [];
    		var stopWords = ["a", "able", "about", "above", "abst", "accordance", "according", "accordingly", "across", "act", "actually", "added", "adj", "affected", "affecting", "affects", "after", "afterwards", "again", "against", "ah", "all", "almost", "alone", "along", "already", "also", "although", "always", "am", "among", "amongst", "an", "and", "announce", "another", "any", "anybody", "anyhow", "anymore", "anyone", "anything", "anyway", "anyways", "anywhere", "apparently", "approximately", "are", "aren", "arent", "arise", "around", "as", "aside", "ask", "asking", "at", "auth", "available", "away", "awfully", "b", "back", "be", "became", "because", "become", "becomes", "becoming", "been", "before", "beforehand", "begin", "beginning", "beginnings", "begins", "behind", "being", "believe", "below", "beside", "besides", "between", "beyond", "biol", "both", "brief", "briefly", "but", "by", "c", "ca", "came", "can", "cannot", "can't", "cause", "causes", "certain", "certainly", "co", "com", "come", "comes", "contain", "containing", "contains", "could", "couldnt", "d", "date", "did", "didn't", "different", "do", "does", "doesn't", "doing", "done", "don't", "down", "downwards", "due", "during", "e", "each", "ed", "edu", "effect", "eg", "eight", "eighty", "either", "else", "elsewhere", "end", "ending", "enough", "especially", "et", "et-al", "etc", "even", "ever", "every", "everybody", "everyone", "everything", "everywhere", "ex", "except", "f", "far", "few", "ff", "fifth", "first", "five", "fix", "followed", "following", "follows", "for", "former", "formerly", "forth", "found", "four", "from", "further", "furthermore", "g", "gave", "get", "gets", "getting", "give", "given", "gives", "giving", "go", "goes", "gone", "got", "gotten", "h", "had", "happens", "hardly", "has", "hasn't", "have", "haven't", "having", "he", "hed", "hence", "her", "here", "hereafter", "hereby", "herein", "heres", "hereupon", "hers", "herself", "hes", "hi", "hid", "him", "himself", "his", "hither", 
"home", "how", "howbeit", "however", "hundred", "i", "id", "ie", "if", "i'll", "im", "immediate", "immediately", "importance", "important", "in", "inc", "indeed", "index", "information", "instead", "into", "invention", "inward", "is", "isn't", "it", "itd", "it'll", "its", "itself", "i've", "j", "just", "k", "keep", "keeps", "kept", "kg", "km", "know", "known", "knows", "l", "largely", "last", "lately", "later", "latter", "latterly", "least", "less", "lest", "let", "lets", "like", "liked", "likely", "line", "little", "'ll", "look", "looking", "looks", "ltd", "m", "made", "mainly", "make", "makes", "many", "may", "maybe", "me", "mean", "means", "meantime", "meanwhile", "merely", "mg", "might", "million", "miss", "ml", "more", "moreover", "most", "mostly", "mr", "mrs", "much", "mug", "must", "my", "myself", "n", "na", "name", "namely", "nay", "nd", "near", "nearly", "necessarily", "necessary", "need", "needs", "neither", "never", "nevertheless", "new", "next", "nine", "ninety", "no", "nobody", "non", "none", "nonetheless", "noone", "nor", "normally", "nos", "not", "noted", "nothing", "now", "nowhere", "o", "obtain", "obtained", "obviously", "of", "off", "often", "oh", "ok", "okay", "old", "omitted", "on", "once", "one", "ones", "only", "onto", "or", "ord", "other", "others", "otherwise", "ought", "our", "ours", "ourselves", "out", "outside", "over", "overall", "owing", "own", "p", "page", "pages", "part", "particular", "particularly", "past", "per", "perhaps", "placed", "please", "plus", "poorly", "possible", "possibly", "potentially", "pp", "predominantly", "present", "previously", "primarily", "probably", "promptly", "proud", "provides", "put", "q", "que", "quickly", "quite", "qv", "r", "ran", "rather", "rd", "re", "readily", "really", "recent", "recently", "ref", "refs", "regarding", "regardless", "regards", "related", "relatively", "research", "respectively", "resulted", "resulting", "results", "right", "run", "s", "said", "same", "saw", "say", "saying", "says", 
"sec", "section", "see", "seeing", "seem", "seemed", "seeming", "seems", "seen", "self", "selves", "sent", "seven", "several", "shall", "she", "shed", "she'll", "shes", "should", "shouldn't", "show", "showed", "shown", "showns", "shows", "significant", "significantly", "similar", "similarly", "since", "six", "slightly", "so", "some", "somebody", "somehow", "someone", "somethan", "something", "sometime", "sometimes", "somewhat", "somewhere", "soon", "sorry", "specifically", "specified", "specify", "specifying", "still", "stop", "strongly", "sub", "substantially", "successfully", "such", "sufficiently", "suggest", "sup", "sure", "t", "take", "taken", "taking", "tell", "tends", "th", "than", "thank", "thanks", "thanx", "that", "that'll", "thats", "that've", "the", "their", "theirs", "them", "themselves", "then", "thence", "there", "thereafter", "thereby", "thered", "therefore", "therein", "there'll", "thereof", "therere", "theres", "thereto", "thereupon", "there've", "these", "they", "theyd", "they'll", "theyre", "they've", "think", "this", "those", "thou", "though", "thoughh", "thousand", "throug", "through", "throughout", "thru", "thus", "til", "tip", "to", "together", "too", "took", "toward", "towards", "tried", "tries", "truly", "try", "trying", "ts", "twice", "two", "u", "un", "under", "unfortunately", "unless", "unlike", "unlikely", "until", "unto", "up", "upon", "ups", "us", "use", "used", "useful", "usefully", "usefulness", "uses", "using", "usually", "v", "value", "various", "'ve", "very", "via", "viz", "vol", "vols", "vs", "w", "want", "wants", "was", "wasn't", "way", "we", "wed", "welcome", "we'll", "went", "were", "weren't", "we've", "what", "whatever", "what'll", "whats", "when", "whence", "whenever", "where", "whereafter", "whereas", "whereby", "wherein", "wheres", "whereupon", "wherever", "whether", "which", "while", "whim", "whither", "who", "whod", "whoever", "whole", "who'll", "whom", "whomever", "whos", "whose", "why", "widely", "willing", "wish", 
"with", "within", "without", "won't", "words", "world", "would", "wouldn't", "www", "x", "y", "yes", "yet", "you", "youd", "you'll", "your", "youre", "yours", "yourself", "yourselves", "you've", "z", "zero"];
    		for (var i = 0; i < words.length; i++) {
    			if (stopWords.indexOf(words[i].toLowerCase()) === -1 && isNaN(words[i])) {
    				nonStopWords.push(words[i].toLowerCase());
    			}
    		}
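    		var keywords = Object.create(null); // no inherited keys, so a word like "constructor" is not mistaken for an existing entry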
    		for (var i = 0; i < nonStopWords.length; i++) {
    			if (nonStopWords[i] in keywords) {
    				keywords[nonStopWords[i]] += 1;
    			}
    			else {
    				keywords[nonStopWords[i]] = 1;
    			}
    		}
    
    		var sortedKeywords = [];
    		for (var keyword in keywords) {
    			sortedKeywords.push([keyword, keywords[keyword]])
    		}
    		sortedKeywords.sort(function(a, b) {
    			return b[1] - a[1]
    		});
    
    		topKeywords.innerHTML = "";
    		for (var i = 0; i < sortedKeywords.length && i < 4; i++) {
    			var li = document.createElement('li');
    			li.innerHTML = "<b>" + sortedKeywords[i][0] + "</b>: " + sortedKeywords[i][1];
    			topKeywords.appendChild(li);
    		}
    	}
    
    	if (words) {
    		keywordsDiv.style.display = "block";
    	}
    	else {
    		keywordsDiv.style.display = "none";
    	}
    });
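    The reading time branch of the code above is not covered in the walkthrough, so here is that calculation pulled out as a sketch. formatReadingTime is a hypothetical name. The 275 words per minute figure is the one used in the code above:

```javascript
// Hypothetical helper: estimates reading time at 275 words per minute
// and formats it the same way as the listing above.
function formatReadingTime(wordCount) {
  var seconds = Math.floor(wordCount * 60 / 275);
  if (seconds > 59) {
    var minutes = Math.floor(seconds / 60);
    seconds = seconds - minutes * 60;
    return minutes + "m " + seconds + "s";
  }
  return seconds + "s";
}

console.log(formatReadingTime(100)); // "21s"
console.log(formatReadingTime(300)); // "1m 5s"
```

    Change the 275 if you want to assume a faster or slower reader.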

    Hope you have enjoyed learning about how a JavaScript word counter works. If you loved this tool, you may also like my popular actual size online ruler and my AJAX-based WordPress password (or plain text) hasher.


    Filed Under: Frontend (Client-Side), JavaScript, Programming, Tools & Resources, Web Development Tagged With: Character Counter, JavaScript, Tools, Word Counter

    About Ehi Kioya

    I am a Toronto-based Software Engineer. I run this website as part hobby and part business.


    Reader Interactions

    Comments

    1. John says

      January 6, 2020 at 4:40 am

      Hi Ehi,

      I made something similar for a newspaper years ago using AJAX, PHP explode and count on the server side to deal with strings including numbers.

      One thing I found was the need to check that there are no instances of two words joined by a comma or full stop and no space, eg ‘test,me’ needs to be transformed to ‘test, me’.

      Just my 2c 🙂
      Regards
      John

      • Ehi Kioya says

        January 6, 2020 at 7:10 am

        Hi John,

        Good point! Even while creating this I knew there would be quite a few word combination scenarios and edge cases I may not have considered yet.

        Thanks for pointing this out. I will look into modifying my regex to handle that situation.

    © 2021   ·   Ehi Kioya   ·   All Rights Reserved