Extracting String Patterns from Excel Files using Python and RegEx

In business life you often end up with tedious, repetitive tasks. Recently, I had to look for nine-digit reference numbers in a few hundred more or less unstructured Excel files. To make the task even more tedious, the Excel files could contain multiple sheets, none of which could be ignored. Since those numbers were part of a backup, I needed to collect every one of them.

My first impulse was to bite the bullet and simply do the job manually: open each Excel file, go through the sheets, and find and copy the relevant numbers into a text file. This would have taken me at least a day, with the risk of missing a few numbers and ending up with an incomplete backup.

Then I realized that this task would be a perfect fit for Python and RegEx. The Python library openpyxl reads and writes Excel files, and RegEx identifies string patterns, in my case the relevant nine-digit numbers.

This is how I did the job:

Open a Jupyter Notebook and import the relevant libraries
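A minimal sketch of the imports this walkthrough relies on. It assumes openpyxl is already installed (for example with `pip install openpyxl`); using `pathlib` for the file handling is a choice made here, not something the text prescribes.

```python
import re                            # regular expressions from the standard library
from pathlib import Path             # convenient directory and file handling

from openpyxl import load_workbook   # opens existing .xlsx files
```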

Now we can iterate over all files in the directory and load each Excel file into a workbook
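One way to sketch this step, assuming all files use the `.xlsx` extension and sit directly in one directory. The helper name `iter_workbooks` is hypothetical, and the `read_only` and `data_only` options are choices made for this sketch: the first keeps memory use low, the second returns cached cell values instead of formulas.

```python
from pathlib import Path

from openpyxl import load_workbook


def iter_workbooks(directory):
    """Yield (path, workbook) for every .xlsx file directly inside `directory`."""
    for path in sorted(Path(directory).glob("*.xlsx")):
        # Excel creates "~$name.xlsx" lock files while a workbook is open; skip them
        if path.name.startswith("~$"):
            continue
        workbook = load_workbook(path, read_only=True, data_only=True)
        try:
            yield path, workbook
        finally:
            # Read-only workbooks keep the file open until closed explicitly
            workbook.close()
```

Each workbook is closed as soon as the loop moves on to the next file, so a few hundred files do not stay open at the same time.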

workbook.sheetnames gives us a list of all sheet names in that (opened) Excel file.
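To illustrate `workbook.sheetnames` without needing a file on disk, the sketch below builds a small in-memory workbook. The sheet titles "Invoices" and "Archive" are made up for the example.

```python
from openpyxl import Workbook

workbook = Workbook()                 # new workbook with one default sheet
workbook.active.title = "Invoices"    # rename the default sheet
workbook.create_sheet("Archive")      # add a second sheet

print(workbook.sheetnames)            # ['Invoices', 'Archive']

# A sheet can be fetched by name with index syntax
for name in workbook.sheetnames:
    sheet = workbook[name]
    print(sheet.title)
```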

To get the content in each sheet, we need to iterate over all rows and columns
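A possible sketch of the cell iteration, using openpyxl's `iter_rows(values_only=True)` so that each row arrives as a tuple of plain values. The helper name `iter_cell_values` and the sample data are assumptions for illustration.

```python
from openpyxl import Workbook


def iter_cell_values(workbook):
    """Yield (sheet_title, value) for every non-empty cell in every sheet."""
    for sheet in workbook.worksheets:
        for row in sheet.iter_rows(values_only=True):
            for value in row:
                if value is not None:   # empty cells come back as None
                    yield sheet.title, value


# Small in-memory workbook to try it out
demo = Workbook()
demo.active.title = "First"
demo.active.append(["Ref 123456789", None, 42])
demo.create_sheet("Second").append(["note", 987654321])

for sheet_title, value in iter_cell_values(demo):
    print(sheet_title, value)
```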

Now comes the step where we apply RegEx, so that we can pull out all occurrences of the nine-digit numbers.

The example below shows how RegEx works: the pattern \d{9} matches nine consecutive digits, so plain text and shorter numbers are ignored.
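Applied to a single cell, this step might look like the sketch below. The function name `find_numbers` is hypothetical. Converting the value with `str()` is an assumption made here, because cells can hold integers, floats or dates as well as text.

```python
import re

NINE_DIGITS = re.compile(r"\d{9}")


def find_numbers(cell_value):
    """Return every run of nine digits found in one cell value."""
    if cell_value is None:
        return []
    # Convert to text first so numeric cells can be searched as well
    return NINE_DIGITS.findall(str(cell_value))


print(find_numbers("Ref: 111222333 / 444555666"))  # ['111222333', '444555666']
print(find_numbers(123456789))                     # ['123456789']
print(find_numbers("abc 12345"))                   # []
```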
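A short illustration with made-up sample text. One caveat worth knowing: on its own, `\d{9}` also matches the first nine digits of a longer number. The stricter variant with lookarounds is an addition in this sketch, not part of the pattern described above, and only accepts runs of exactly nine digits.

```python
import re

text = "Ref 123456789, phone 12345, order A-987654321 and 1234567890"

# Plain pattern: any nine consecutive digits, even inside a longer number
plain = re.findall(r"\d{9}", text)
print(plain)    # ['123456789', '987654321', '123456789']

# Stricter pattern: no digit directly before or after the nine digits
strict = re.findall(r"(?<!\d)\d{9}(?!\d)", text)
print(strict)   # ['123456789', '987654321']
```

Whether the stricter form is needed depends on the data: if the files also contain longer numbers, the plain pattern would report their first nine digits as a reference number.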

This is the full code:
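The steps above can be combined roughly as follows. This is a sketch under stated assumptions rather than the original script: the function names, the `.xlsx` filter, the strict nine-digit pattern and the directory and output file names in the usage comment are all hypothetical.

```python
import re
from pathlib import Path

from openpyxl import load_workbook

# Exactly nine digits, not embedded in a longer number (assumption, see above)
REFERENCE_PATTERN = re.compile(r"(?<!\d)\d{9}(?!\d)")


def extract_reference_numbers(directory):
    """Collect all nine-digit numbers from every sheet of every .xlsx file."""
    found = set()  # a set drops duplicates across files and sheets
    for path in sorted(Path(directory).glob("*.xlsx")):
        if path.name.startswith("~$"):  # skip Excel lock files
            continue
        workbook = load_workbook(path, read_only=True, data_only=True)
        try:
            for sheet in workbook.worksheets:
                for row in sheet.iter_rows(values_only=True):
                    for value in row:
                        if value is None:
                            continue
                        found.update(REFERENCE_PATTERN.findall(str(value)))
        finally:
            workbook.close()
    return sorted(found)


def write_numbers(numbers, output_file):
    """Write one number per line to a text file."""
    Path(output_file).write_text("\n".join(numbers) + "\n", encoding="utf-8")


# Usage (hypothetical paths):
# numbers = extract_reference_numbers("excel_files")
# write_numbers(numbers, "reference_numbers.txt")
```

Numbers stored in numeric cells lose any leading zeros in Excel itself, so a reference number such as 012345678 would only be found if the cell is formatted as text.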
