Project Review: Text Analysis

Counting Words and Letters Found Within a Text String

Summary

Introduction

The text analysis app demonstrates using Python to count the occurrences of each word and letter in a string typed or pasted by the user. No models are required for this app; Django's views handle the request/response cycle, and a Django template renders the resulting context.

The app also calculates the rate at which each letter appears within the text being evaluated.
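As an illustration of the letter-rate calculation, here is a minimal sketch. The function name letter_rates and the choice to express the rate as a percentage of all letters (ignoring spaces, digits and punctuation) are my assumptions for this example, not necessarily how the project's own code does it.

```python
import string
from typing import List, Tuple


def letter_rates(text: str) -> List[Tuple[str, int, float]]:
    """Return (letter, count, percentage) for each letter a-z found in text.

    Hypothetical helper: the percentage is measured against the total
    number of ASCII letters, so spaces and punctuation are excluded.
    """
    lowered = text.lower()
    total = sum(1 for ch in lowered if ch in string.ascii_lowercase)
    results = []
    for letter in string.ascii_lowercase:
        count = lowered.count(letter)
        if count:  # total is non-zero whenever any count is non-zero
            results.append((letter, count, round(count / total * 100, 2)))
    return results


print(letter_rates("Abba"))  # [('a', 2, 50.0), ('b', 2, 50.0)]
```

Because only letters that actually occur are divided by the total, an empty string or a string with no letters simply returns an empty list rather than raising a division error.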

Features

  • Uses Python lists to store results from the various utility functions
  • Uses a Python for loop to iterate over every letter of the alphabet and test for its presence in the text
  • Creates a context dictionary with several elements used within the Django template to give the user information about the text they have chosen to analyse
  • Provides a button to load standard 'lorem ipsum' text, saving the user from typing or pasting their own, and gives the user feedback if no text is submitted
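To give a feel for the context dictionary and the empty-text feedback, here is a rough sketch in plain Python. The function name build_context, the dictionary keys and the wording of the message are all hypothetical; in the real app a dictionary like this would be passed to the Django template by the view.

```python
def build_context(text: str) -> dict:
    """Build a template context for the submitted text.

    Hypothetical sketch: key names and the feedback message are
    illustrative, not taken from the project.
    """
    cleaned = text.strip()
    if not cleaned:
        # Nothing to analyse, so return feedback for the user instead.
        return {"text": "", "error": "Please enter some text to analyse."}
    return {
        "text": cleaned,
        "word_total": len(cleaned.split()),
        "character_total": len(cleaned),
        "error": None,
    }


context = build_context("one two three")
print(context["word_total"])       # 3
print(context["character_total"])  # 13
```

Keeping this logic in an ordinary function means it can be checked without starting a Django server.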

Objectives

To use Python to calculate the number of instances of each word and letter within a text string.

This was one of my earlier scripting projects, completed before I started building a Django portfolio in earnest. I added it to my Django portfolio at a later date and polished it further, refactoring the back-end code and improving the presentation of the tables on the front end.

The Approach & Solution

I started by thinking about the algorithm and approach to calculate the frequency of words and letters within the text block. It seemed reasonable to me that I would need to split the string into two operational lists.

The first would be a list of words that could be looped over and counted for their frequency of occurrence.

The second would be a list of letters that could be looped over and counted for their frequency of occurrence.

The next part would be to sort the lists in a logical order so that they could be rendered usefully to the user.

For words, it made sense that the list should be sorted by frequency of occurrence. For letters, it made sense that the list should be sorted alphabetically.
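The approach described above can be sketched roughly as follows. The function names, the regular expression used to split words, and the alphabetical tie-break for equally frequent words are my assumptions for this example. Note that this simple pattern treats an apostrophe as a word boundary, so "don't" would be counted as "don" and "t".

```python
import re
import string
from typing import List, Tuple


def word_frequencies(text: str) -> List[Tuple[str, int]]:
    """Count each word and sort by frequency, most common first."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    # Negative count gives descending frequency; the word itself breaks ties.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def letter_frequencies(text: str) -> List[Tuple[str, int]]:
    """Count each letter present in the text, sorted alphabetically."""
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    return [
        (letter, letters.count(letter))
        for letter in string.ascii_lowercase
        if letter in letters
    ]


print(word_frequencies("The cat and the hat"))
# [('the', 2), ('and', 1), ('cat', 1), ('hat', 1)]
print(letter_frequencies("Baa"))
# [('a', 2), ('b', 1)]
```

Looping over string.ascii_lowercase gives the alphabetical order for free, so only the word list needs an explicit sort.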

Evaluation

Following a refactor of the project, the code is now easier to understand and maintain.

The view code has been extracted into a 'views.py' file, where the functions relating to the rendering of Django views live.

The remainder of the code consists of utility functions that have been moved into a separate module called 'utils.py'.

Languages, Technologies & Skills Used

In approximate order of frequency used...
Languages: Python, HTML, JavaScript, Sass
Frameworks / Services: Django, Bootstrap, Font Awesome
Software: VS Code
Libraries: re, string, typing
Notable Packages: N/A
Infrastructure: GitHub, Docker, Poetry