Simplified Email Extraction with a JavaScript Bookmark
I needed to extract email addresses from a web page. It was going to be a repetitive task: each page had a similar structure but different email addresses, and the end goal was to copy and paste those addresses into a spreadsheet. A bookmark that runs JavaScript turned out to be the simplest way to extract the emails. Let’s learn how!
Extract Emails
In my particular example, the emails we wanted to extract were inside of a table that had a specific id. Again, the goal was to extract these emails, copy them to the clipboard, and manually paste them into a spreadsheet.
Let’s start by creating an immediately invoked function expression (IIFE). This will keep all the variables neatly scoped to the function itself. Because we want to use await later in the code, we also define the function as async.
(async () => {
})();
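To see what the IIFE gives us, here is a minimal sketch. The names scoped and result are made up for this illustration and are not part of the bookmark code. The variable declared inside the function never reaches the outer scope, and because the function is async, calling it returns a promise.

```javascript
// Illustrative only: `scoped` and `result` are hypothetical names.
const result = (async () => {
  const scoped = 'only visible inside the function';
  return scoped.length;
})();

// Outside the function, `scoped` was never declared.
console.log(typeof scoped); // "undefined"

// An async function always returns a promise.
console.log(result instanceof Promise); // true
```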
The first thing we should check is whether we’re actually on the web page that this code is meant for. If we’re not, we should show an alert and throw an error to stop the rest of the code from running.
  if (!window.location.href.startsWith('URL_HERE')) {
    const hrefError = 'This bookmark cannot be used on this webpage.';
    alert(hrefError);
    throw hrefError;
  }
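A plain startsWith check works well when the prefix includes a path. If the prefix is only a host, such as https://example.com, a look-alike host such as https://example.com.evil.test would also pass. One possible stricter variant, sketched here with a hypothetical helper named isAllowedPage and made-up example URLs, compares the origin and the path separately.

```javascript
// Hypothetical helper, not part of the original bookmark.
// Returns true when currentHref has the same origin as allowedPrefix
// and its path begins with the path of allowedPrefix.
function isAllowedPage(currentHref, allowedPrefix) {
  let current;
  let allowed;
  try {
    current = new URL(currentHref);
    allowed = new URL(allowedPrefix);
  } catch {
    // Either value was not a valid absolute URL.
    return false;
  }
  return (
    current.origin === allowed.origin &&
    current.pathname.startsWith(allowed.pathname)
  );
}

console.log(isAllowedPage('https://example.com/reports/42', 'https://example.com/reports')); // true
console.log(isAllowedPage('https://example.com.evil.test/reports', 'https://example.com')); // false
```

Inside the bookmark, the condition would become !isAllowedPage(window.location.href, 'URL_HERE').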
Let’s find that table by its unique id.
const table = document.getElementById('ID_HERE');
If the table can’t be found, we should show an alert and throw an error to stop the rest of the code from running.
  if (!table) {
    const tableError = 'Table not found';
    alert(tableError);
    throw tableError;
  }
Now let’s look for all the links within the table.
const links = table.getElementsByTagName('a');
Let’s also create a variable called emails, which will be an empty array to start, so we can store all the emails we find.
const emails = [];
We need to loop through all the links we found in the table and check whether each href attribute starts with mailto:. If it does, then we’ve found an email! We’ll remove the mailto: prefix (7 characters) and be left with the email, which we can add to the emails array.
  for (const link of links) {
    const href = link.getAttribute('href');
    if (href && href.startsWith('mailto:')) {
      emails.push(href.substring(7));
    }
  }
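Removing the first 7 characters is enough when every link is a plain mailto:address. Some mailto links also carry a query string (for example ?subject=Hello), list several addresses separated by commas, or percent-encode characters. If your pages contain links like that, a small helper along these lines could be used instead. The name emailsFromHref is hypothetical, and this is only a sketch, not a full mailto parser.

```javascript
// Hypothetical helper: returns the addresses found in a mailto: href,
// or an empty array when the href is not a mailto link.
function emailsFromHref(href) {
  if (!href || !href.toLowerCase().startsWith('mailto:')) {
    return [];
  }
  // Drop the scheme, then anything after "?" (subject, body, cc, ...).
  const addressPart = href.slice('mailto:'.length).split('?')[0];
  return addressPart
    .split(',')
    .map((raw) => {
      try {
        return decodeURIComponent(raw).trim();
      } catch {
        // Malformed percent-encoding: keep the raw text.
        return raw.trim();
      }
    })
    .filter((address) => address.length > 0);
}

console.log(emailsFromHref('mailto:jane@example.com?subject=Hello'));
// [ 'jane@example.com' ]
```

In the loop, emails.push(...emailsFromHref(href)) would replace the startsWith check and the substring call.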
Now that we have an array of emails, let’s copy them to the clipboard as a single string with a newline character between each email. That way each email lands in its own row when pasted into the spreadsheet. We use await because navigator.clipboard.writeText() is asynchronous and returns a promise.
await navigator.clipboard.writeText(emails.join('\n'));
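If the same address can appear more than once in the table, you may want to drop duplicates before copying. That step is not part of the original bookmark. The sketch below uses a hypothetical helper named formatForClipboard that trims each entry, removes duplicates without regard to letter case, and joins the rest with newlines.

```javascript
// Hypothetical helper: builds the newline-separated string for the clipboard,
// keeping the first occurrence of each address (case-insensitive).
function formatForClipboard(emails) {
  const seen = new Set();
  const unique = [];
  for (const email of emails) {
    const trimmed = email.trim();
    const key = trimmed.toLowerCase();
    if (key && !seen.has(key)) {
      seen.add(key);
      unique.push(trimmed);
    }
  }
  return unique.join('\n');
}

console.log(formatForClipboard(['a@example.com', 'A@example.com ', 'b@example.com']));
// a@example.com
// b@example.com
```

In the bookmark, the call would then read await navigator.clipboard.writeText(formatForClipboard(emails)).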
All that’s left to do is show how many emails were copied to the clipboard.
alert(`${emails.length} emails copied to clipboard`);
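As a small optional polish, the message can be adjusted so it reads naturally when exactly one email is found. The function name copiedMessage is made up for this sketch.

```javascript
// Hypothetical helper: picks "email" or "emails" based on the count.
function copiedMessage(count) {
  const noun = count === 1 ? 'email' : 'emails';
  return `${count} ${noun} copied to clipboard`;
}

console.log(copiedMessage(1)); // "1 email copied to clipboard"
console.log(copiedMessage(3)); // "3 emails copied to clipboard"
```

The bookmark would then call alert(copiedMessage(emails.length)).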
Here’s the final code:
(async () => {
  if (!window.location.href.startsWith('URL_HERE')) {
    const hrefError = 'This bookmark cannot be used on this webpage.';
    alert(hrefError);
    throw hrefError;
  }

  const table = document.getElementById('ID_HERE');
  if (!table) {
    const tableError = 'Table not found';
    alert(tableError);
    throw tableError;
  }

  const links = table.getElementsByTagName('a');
  const emails = [];
  for (const link of links) {
    const href = link.getAttribute('href');
    if (href && href.startsWith('mailto:')) {
      emails.push(href.substring(7));
    }
  }

  await navigator.clipboard.writeText(emails.join('\n'));
  alert(`${emails.length} emails copied to clipboard`);
})();
Why A Bookmark?
When I first started working on this, I tried using a Google Chrome Snippet. When I got to the point of copying to the clipboard, it didn’t work. The reason is that the user must take an action, like clicking a button, before the page is allowed to use the clipboard.
I injected a button into the page that, when clicked, would run the function to get the emails and copy them to the clipboard. However, that meant the user had to open the browser’s developer tools, run the snippet, then click the button. That’s too many steps, especially for someone who might not be tech savvy or who would find the developer tools confusing.
I could also have created a Google Chrome Extension, but this was for a very specific use case and not something I wanted to publish to the store. I would have had to develop the extension, package it up, and explain to the user how to manually load an unpacked extension. Again, not easy for someone who might not be tech savvy.
A bookmark was easier! You can run JavaScript from a bookmark’s URL!
Create Bookmark
Within Google Chrome’s menu, navigate to Bookmarks and lists > Bookmark manager. Under the Bookmark manager menu, choose Add new bookmark.
In the Name field, give the bookmark a name like Extract Emails. In the URL field, begin by typing javascript: (yes, include the colon after the word javascript) and then paste in the final code from above. Save the bookmark.
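Pasting multi-line code into a single-line URL field can mangle line breaks, and a // comment would swallow everything that follows it once the code is on one line. One way around this is to percent-encode the code first, since browsers decode a javascript: URL before running it. The helper name toBookmarklet and the tiny sample script below are assumptions made for this sketch.

```javascript
// Hypothetical helper: turns JavaScript source into a bookmark-ready URL.
function toBookmarklet(source) {
  // encodeURIComponent escapes line breaks, spaces, and other special characters.
  return 'javascript:' + encodeURIComponent(source.trim());
}

// A tiny sample script, used only to show the encoding.
const example = toBookmarklet(`
(() => {
  // comments and line breaks survive the encoding
  alert('hello');
})();
`);

console.log(example.startsWith('javascript:')); // true
console.log(example.includes('\n')); // false
```

You could run this once in Node.js or the browser console with the final code as the input, then paste the resulting string into the URL field as is.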
If you don’t have the bookmarks bar visible, go to Google Chrome’s menu and choose Bookmarks and lists > Show bookmarks bar. The bookmark you just created should be visible.
Visit the page the bookmark was written for, click the bookmark, and you should get an alert showing the number of emails copied to the clipboard!
Visit our website at https://nightwolf.dev and follow us on Twitter!