The overall visual attributes of Web pages, such as their aesthetics, significantly influence user experience. A visually appealing, well-laid-out Web page facilitates user access and enhances the browsing experience. In this paper, we propose a new method for learning an assessment model for the visual aesthetics of Web pages. First, multimodal features (structural, local visual, global visual, and functional) that are known to significantly affect a page's aesthetics are extracted to construct a feature vector. Second, the interuser disagreement on aesthetics is analyzed, novel aesthetic representations are obtained from the multiuser ratings of a page, and a structural learning algorithm is proposed for these representations. Third, because a Web page's functional purpose also affects its perceived aesthetics, we divide Web pages into different types using functional features and introduce a soft multitask fusion learning strategy to train assessment models for pages with different functional purposes. Experimental results show the effectiveness of our method: 1) the combination of structural, local, and global visual features outperforms existing state-of-the-art Web aesthetic features; 2) the proposed structural learning algorithm achieves good results for the new aesthetic representations; and 3) the proposed soft multitask fusion learning strategy improves the performance of aesthetics assessment models.
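To make two of the ideas above concrete, the following is a minimal illustrative sketch, not the algorithm proposed in the paper. It assumes ratings on a 1-to-5 scale, summarizes a page's multiuser ratings as a distribution with its mean and standard deviation (one plausible way to expose interuser disagreement), and blends per-type model scores with softmax weights over page-type scores (one plausible reading of "soft" fusion). The function names `rating_representation` and `soft_fusion`, the rating scale, and the softmax weighting are all assumptions introduced for illustration.

```python
from collections import Counter
from math import exp, sqrt


def rating_representation(ratings, levels=(1, 2, 3, 4, 5)):
    """Summarize one page's multiuser ratings as (distribution, mean, std).

    Hypothetical helper: the 1-5 scale and this particular summary are
    assumptions, not the representation defined in the paper.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    n = len(ratings)
    counts = Counter(ratings)
    # Fraction of users who chose each rating level.
    dist = [counts.get(level, 0) / n for level in levels]
    mean = sum(ratings) / n
    # Population standard deviation as a simple disagreement measure.
    std = sqrt(sum((r - mean) ** 2 for r in ratings) / n)
    return dist, mean, std


def soft_fusion(type_scores, type_logits):
    """Blend per-type model scores using softmax weights over type logits.

    Hypothetical helper: type_scores[i] is the aesthetic score predicted by
    the model for page type i, and type_logits[i] is an unnormalized score
    for how strongly the page belongs to type i.
    """
    if not type_scores or len(type_scores) != len(type_logits):
        raise ValueError("scores and logits must be non-empty and aligned")
    # Subtract the maximum logit for numerical stability.
    peak = max(type_logits)
    exps = [exp(z - peak) for z in type_logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sum(w * s for w, s in zip(weights, type_scores))
```

Under these assumptions, the ratings `[1, 5, 1, 5]` and `[3, 3, 3, 3]` share the same mean but have very different distributions and standard deviations, which is why a single averaged score can hide interuser disagreement. Likewise, a page that belongs equally to two types receives the average of the two type-specific predictions rather than being forced into one type.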