Facial action unit detection (FAUD) aims to detect subtle facial motions, known as action units (AUs), induced by the contraction of facial muscles. In recent years, FAUD has become an attractive task since these slight facial changes reveal cues of emotion that can be used to infer people's affective states. Many prior works are built on physical properties of AUs, such as regional occurrence and temporal continuity, but few of them address the negative impact of individual differences, which leads to unrobust results across varying subjects and environments. To deal with this problem, a contrastive feature learning method is proposed that trains a convolutional neural network (CNN) to extract the contrastive feature, defined as the difference between the features of a neutral face image and an AU-occurring face image. In this way, individual information is mitigated, making it easier to detect AUs from the features. A large disparity between the numbers of positive and negative samples, known as data imbalance, seriously interferes with most detection problems, including FAUD. The lower the occurrence rate of an AU, the more severe the data imbalance, which biases the detection result for that AU. Therefore, a class-weighted loss is proposed to adjust the weight ratio between positive and negative samples so as to encourage the learning of positive samples and ease that of negative samples. Two widely used databases, BP4D and DISFA, are adopted as benchmarks for performance testing. Experiments show that the proposed method performs well in both accuracy and speed compared with state-of-the-art approaches.
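The two ideas above can be sketched as follows. This is a minimal illustration, not the paper's exact implementation: the contrastive feature is taken as the elementwise difference between a neutral-face feature vector and an AU-face feature vector of the same subject, which cancels the shared subject-specific component; the class-weighted loss is shown as a binary cross-entropy whose positive term is scaled by a weight (e.g., the negative-to-positive sample ratio). Function names and the exact weighting scheme are illustrative assumptions.

```python
import numpy as np

def contrastive_feature(neutral_feat, au_feat):
    # Contrastive feature: difference between the AU-occurring face feature
    # and the neutral face feature of the same subject. The subject-specific
    # (identity) component appears in both and cancels in the subtraction.
    return au_feat - neutral_feat

def class_weighted_bce(probs, labels, pos_weight):
    # Class-weighted binary cross-entropy: the positive term is scaled by
    # pos_weight (> 1 for rare AUs) to counter data imbalance, encouraging
    # learning of positive samples while leaving negative samples unchanged.
    eps = 1e-7
    probs = np.clip(probs, eps, 1 - eps)
    loss = -(pos_weight * labels * np.log(probs)
             + (1 - labels) * np.log(1 - probs))
    return loss.mean()
```

For example, modeling a feature as identity + AU signal, the subtraction recovers only the AU signal, while setting `pos_weight` to the negative/positive sample ratio makes rare-AU errors cost more during training.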