# Copyright 2015 The Chromium Authors. All rights reserved.
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.

"""Code for parsing HTML.

The purpose of this module is to ensure consistency of HTML parsing
in catapult_build.
"""

import bs4


def BeautifulSoup(contents):
  # html5lib is a lenient parser; compared with the default parser,
  # it is more similar to how a web browser parses. See:
  # http://www.crummy.com/software/BeautifulSoup/bs4/doc/#installing-a-parser
  return bs4.BeautifulSoup(markup=contents, features='html5lib')