Our first open-source contribution: HTML Comment Extractor (HCE)

We have always wanted to give back to the testing community, which is why we have a dedicated team that makes sure we contribute consistently. We do it not because giving to the community is considered a good thing, but because we feel good doing so, and that is sufficient for us. Anything we get back from the community is a bonus and makes us feel good too. This is just a start, and we shall contribute to the testing community whenever our hearts tell us to give. As we love to say: “We make sure we dedicate some time each day to think of ideas that can add value to our software testing community, so that testers, developers, and managers can benefit when testing their software.”

Why did we develop HTML Comment Extractor / Parser?

A long time ago, when we were working on a project in the retail domain, we saw some sensitive details in the comments (HTML and JavaScript) in the client-side code of a web application. One comment revealed third-party analytics login credentials, and anyone who came across comments of this kind could easily get at the analytics data. So we thought of building a utility that extracts the comments from the URL we provide and then scans the extracted comments for sensitive information. What counts as sensitive is defined by a dictionary of keywords that a tester, developer, test manager, or anyone else can define based on their context, that is, whatever they consider sensitive or otherwise useful information.
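To make the idea concrete, here is a minimal sketch of the approach described above. It is not the actual HCE source code: the names CommentCollector, extract_comments, find_sensitive, SAMPLE_HTML, and KEYWORDS are made up for illustration, the page is an inline sample rather than a fetched URL, and only HTML comments are handled (JavaScript comments would need separate handling).

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Hypothetical helper: gathers the text of every HTML comment it sees."""

    def __init__(self):
        super().__init__()
        self.comments = []

    def handle_comment(self, data):
        # Called by HTMLParser for each <!-- ... --> block.
        self.comments.append(data.strip())


def extract_comments(html):
    """Return a list of HTML comment strings found in the given markup."""
    collector = CommentCollector()
    collector.feed(html)
    collector.close()
    return collector.comments


def find_sensitive(comments, keywords):
    """Return (comment, matched_keywords) pairs for comments containing any keyword.

    Matching is a plain case-insensitive substring check (an assumption;
    the real utility may match differently).
    """
    hits = []
    for comment in comments:
        lowered = comment.lower()
        matched = [k for k in keywords if k.lower() in lowered]
        if matched:
            hits.append((comment, matched))
    return hits


# Sample page and keyword dictionary, both invented for this sketch.
SAMPLE_HTML = """
<html>
  <body>
    <!-- TODO: tidy up the layout -->
    <!-- analytics login: user=demo password=example123 -->
    <p>Hello</p>
  </body>
</html>
"""

KEYWORDS = ["password", "login", "api_key"]

if __name__ == "__main__":
    for comment, matched in find_sensitive(extract_comments(SAMPLE_HTML), KEYWORDS):
        print(matched, "->", comment)
```

In this sketch the keyword list plays the role of the dictionary: changing KEYWORDS changes what is treated as sensitive, without touching the extraction step.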

This utility was developed in Python. We thank the people responsible for developing this beautiful programming language. We also thank our team member Sandeep Tuppad, who developed HCE for the testing community, and Karthik Kini, who got the product page for this utility up and running (http://apps.testinsane.com/hce).

A quick overview: here is a pictorial representation of how the HTML Comment Parser utility works.

[Image: HTML Comment Extractor pictorial representation]

For the complete guide, see our ReadMe.txt file.
You can download the utility along with the source code at http://apps.testinsane.com/hce/

We Love To Help

If you face any difficulties in using it, or if you have any ideas, please feel free to share them with us. If there is mutual interest, we can work together on building something and giving it back to the community. Write to us at welovetohelp@apps.testinsane.com.

10 comments on “Our first open-source contribution: HTML Comment Extractor (HCE)”

  1. It's really wonderful to know about it. It will definitely help test engineers and developers, and initiate a move towards clean, secure coding too.

    • Narayan, Thank you for your comment. We will keep releasing such utilities to the testing community whenever we feel like! Thanks again :-)

  2. Hi Santhosh,

    It's really a fantastic tool, and I am sure it's going to help all test engineers. I really appreciate that you guys have used the Python language to develop this utility. I always believe you guys will rock in all aspects. And I should specially mention your web site (https://testinsane.com/); it is nicely developed. Expecting many more utilities like this to help us.


    • Thank you Sreekaram Sreekar :-) Surely, we would release more such utilities as free or open-source or micro-service! Thanks again for your lovely comment.

  3. Hi Santosh,
    I appreciate your effort to bring open source utilities to the community. It is a noble idea. Although small, this particular utility is definitely helpful.


    • Thanks LN. And thanks to our team member Sandeep Tuppad for executing this idea! Please share it with your team and others!

  4. Hi santhosh,

    I have used a similar feature for extracting comments from web hits in the WebScarab tool, though without the intelligence this utility has of extracting comments that contain particular keywords.

    Does this spider the whole website link and extract the comments? – I tried to download the utility, but my system only prompts with ‘dismiss’. I am not able to download it.


    • Thanks Amjath for your comment and letting us know about your experience.

      -Does this spider the whole website link and extract the comments?-
      No, it doesn’t have a crawler feature in the current version. It goes through all the URL(s) listed in the Excel sheet and, from those specific URL(s), extracts the comments that contain particular keywords (also listed in the Excel sheet).

      Note: We have tested it on a 64-bit machine (Windows 7). If it doesn’t work on 32-bit, please compile the source code to use it on a 32-bit machine.

      If you are speaking about the “Discard” issue that comes up in the Chrome web browser, then you need to choose “Keep” by clicking the drop-down arrow instead of clicking on “Discard”.

      Here is the image URL if you can’t see the image above: https://testinsane.com/blog/wp-content/uploads/2014/11/Chrome_Discard_HCE_Message.png

  5. Amjath, I guess that it is something to do with your Chrome configuration. I suggest that you download it using Firefox, Safari, or Internet Explorer, whichever helps you download it, as the mission is to use HCE and not worry about other things. You can investigate that problem if you are interested. Let us focus on HCE as of now :-) And again, your Chrome may still block other things on SourceForge, which may be in zip format.

    Consider it a false alarm, and just go ahead and download it from a different client.

    Or else, I suggest you look into http://www.ghacks.net/2014/07/15/fix-chromes-safe-browsing-feature-blocking-downloads-browser/ and fix your Chrome web browser to allow the download.

    Thanks :-)
