In this project, our goal is to build a framework that automatically extracts features from unstructured text and evaluates those features with various methods, in order to study how different feature extraction models perform on different text datasets. The results will then be used to guide feature extraction from Rabobank’s email dataset.

Because of the variability of natural language, transforming language data into numbers that a computer can process has always been a difficult problem. In machine learning on text data, one classical way to turn text into numbers is the bag-of-words (BOW) model, which uses the occurrence counts of words as features. BOW is powerful and simple, but it ignores semantic meaning, a key part of how people express their opinions or feelings. Unlike bag-of-words, which discards context information such as grammar and word order, word embeddings can capture the semantic meaning of words: they can identify synonyms and words that appear in similar contexts. However, despite the success of word embeddings, there is no standard way to derive a document representation from them.

For this reason, we build a framework that implements different models to extract document representations from word embeddings, alongside classical BOW models, and that evaluates those features in different classification and clustering models. We use several evaluation measures to check the robustness and stability of these models, for instance F1-score, Cohen’s kappa, purity, and variance. During the experiments, we search for the best configurations, such as model parameters, the number of neural network layers, and the feature dimensionality, in order to improve performance and make the models more efficient.
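To make the contrast concrete, the sketch below illustrates, under simplified assumptions, the two kinds of document features discussed above and how they might be scored. It is not the project's actual pipeline: the tiny corpus, the toy labels, and the random 50-dimensional embedding table (standing in for pretrained vectors such as word2vec or GloVe) are purely illustrative, and a single logistic regression classifier is used only to show F1-score and Cohen's kappa as evaluation measures.

```python
# Minimal sketch: bag-of-words features vs. averaged word-embedding features,
# both evaluated with the same classifier and scored by F1 and Cohen's kappa.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, cohen_kappa_score
from sklearn.model_selection import train_test_split

# Toy corpus and labels (illustrative only).
docs = [
    "the payment was processed quickly",
    "great service and quick response",
    "thanks for resolving my complaint",
    "very helpful and friendly support",
    "my transfer failed again",
    "the app keeps crashing during login",
    "still waiting for a refund",
    "nobody answered my question",
]
labels = np.array([1, 1, 1, 1, 0, 0, 0, 0])

# Bag-of-words: each feature is the occurrence count of a vocabulary word.
bow = CountVectorizer()
X_bow = bow.fit_transform(docs).toarray()

# Word embeddings: a random table stands in for pretrained vectors here;
# a document vector is simply the mean of its word vectors.
rng = np.random.default_rng(0)
emb = {w: rng.normal(size=50) for w in bow.get_feature_names_out()}

def doc_vector(text):
    vecs = [emb[w] for w in text.split() if w in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(50)

X_emb = np.vstack([doc_vector(d) for d in docs])

# Evaluate both representations with the same classifier and metrics.
for name, X in [("bag-of-words", X_bow), ("embedding-mean", X_emb)]:
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, labels, test_size=0.5, random_state=0, stratify=labels
    )
    pred = LogisticRegression(max_iter=1000).fit(X_tr, y_tr).predict(X_te)
    print(name,
          "F1:", round(f1_score(y_te, pred, average="macro"), 3),
          "kappa:", round(cohen_kappa_score(y_te, pred), 3))
```

In the framework itself, the embedding-averaging step would be replaced by the various document-representation models under study, and the same features could also be fed to clustering models and scored with measures such as purity.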