Today I was given a 150-page MS Word document and was asked to pull out all of the email address within the document and place them into a MS Excel file.
Since the email addresses were imbedded throughout the document’s text, I couldn’t just sort, cut, and paste them. I needed to find a way to pull them from the text. Here’s how I did it.
- In MS Word, use Find and Replace to find every “space” and replace it with a new line break “^”. Since email addresses don’t have spaces, they will all appear on a new line.
- Copy the extremely long document into MS Excel. All copy will appear in a single column.
- Highlight that column and choose Text to Columns within MS Excel’s Data menu.
- For the Delimiter, input the AT symbol “@”. This will spilt all email address into the second column, all other content will remain in the first column.
- Highlight both columns and sort the content by Column B. Now all of the email address are in the top rows.
- Copy all of the rows that have content in two rows and paste it back into a new MS Word file. Make sure you paste as plain text.
- All that’s left is for you to Find and Replace every Tab “^t” with the at symbol “@”