This project investigates whether the recently popular attention-based Transformer model can outperform previously proposed baseline models on sequential recommendation. We compare a Transformer using an input setup similar to that of the Behavior Sequence Transformer against a Long Short-Term Memory (LSTM) baseline with the same input setup. Both models are evaluated on the MovieLens dataset, which contains 3,900 movies, 6,040 users, and 1,000,209 ratings with integer scores from 1 to 5, where each user has rated at least 20 movies. MovieLens is widely used to develop and test recommendation algorithms, especially collaborative filtering methods. The two models are trained and validated on the task of predicting user ratings from sequences of feature inputs. On validation data, the LSTM model achieves an average root mean square error (RMSE) of approximately 1.3547, while the Transformer model achieves approximately 1.2122.
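As a concrete reference for the evaluation metric, the sketch below shows how the validation RMSE could be computed for a batch of predicted ratings. The function and the array values are illustrative placeholders, not the project's actual code or outputs.

```python
import numpy as np

def rmse(predicted: np.ndarray, actual: np.ndarray) -> float:
    """Root mean square error between predicted and true ratings."""
    return float(np.sqrt(np.mean((predicted - actual) ** 2)))

# Toy example: MovieLens ratings are integers from 1 to 5,
# while model predictions are continuous values.
predicted = np.array([3.8, 2.1, 4.6, 3.0])
actual = np.array([4, 2, 5, 3])
print(rmse(predicted, actual))  # RMSE for this illustrative batch
```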