Information Retrieval
- [1] arXiv:2405.09638 [pdf, ps, html, other]
Title: HMAR: Hierarchical Masked Attention for Multi-Behaviour Recommendation
Subjects: Information Retrieval (cs.IR); Machine Learning (cs.LG)
In the context of recommendation systems, addressing multi-behavioral user interactions has become vital for understanding evolving user behavior. Recent models utilize techniques like graph neural networks and attention mechanisms for modeling diverse behaviors, but capturing sequential patterns in historical interactions remains challenging. To tackle this, we introduce Hierarchical Masked Attention for multi-behavior recommendation (HMAR). Specifically, our approach applies masked self-attention to items of the same behavior, followed by self-attention across all behaviors. Additionally, we propose historical behavior indicators to encode the historical frequency of each item's behavior in the input sequence. Furthermore, the HMAR model operates in a multi-task setting, allowing it to learn item behaviors and their associated ranking scores concurrently. Extensive experimental results on four real-world datasets demonstrate that our proposed model outperforms state-of-the-art methods. Our code and datasets are available here (this https URL).
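As a rough illustration of the first attention stage described in this abstract, the sketch below builds a mask that lets each position attend only to positions with the same behavior type, then applies a masked softmax. It is a minimal sketch under assumptions, not the authors' implementation: the function names, the behavior labels, and the uniform scores are hypothetical, and the second stage (self-attention across all behaviors) is omitted.

```python
import math

def same_behavior_mask(behaviors):
    """mask[i][j] is True when positions i and j share a behavior type."""
    return [[bi == bj for bj in behaviors] for bi in behaviors]

def masked_attention_weights(scores, mask):
    """Row-wise softmax over raw scores, restricted to positions where mask is True."""
    weights = []
    for row_scores, row_mask in zip(scores, mask):
        # Every position shares a behavior with itself, so `allowed` is never empty.
        allowed = [s for s, m in zip(row_scores, row_mask) if m]
        top = max(allowed)  # subtract the max for numerical stability
        exps = [math.exp(s - top) if m else 0.0
                for s, m in zip(row_scores, row_mask)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Hypothetical interaction sequence and uniform raw attention scores.
behaviors = ["view", "cart", "view", "buy"]
scores = [[0.0] * len(behaviors) for _ in behaviors]

mask = same_behavior_mask(behaviors)
weights = masked_attention_weights(scores, mask)
```

With these inputs, the two "view" positions split their attention between each other, while the single "cart" and "buy" positions attend only to themselves.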
- [2] arXiv:2405.10232 [pdf, ps, html, other]
Title: Beyond Static Calibration: The Impact of User Preference Dynamics on Calibrated Recommendation
Comments: 8 pages, 4 figures, accepted as LBR paper at UMAP '24 -- ACM Conference on User Modeling, Adaptation and Personalization 2024
Subjects: Information Retrieval (cs.IR)
Calibration in recommender systems is an important performance criterion that ensures consistency between the distribution of user preference categories and that of recommendations generated by the system. Standard methods for mitigating miscalibration typically assume that user preference profiles are static, and they measure calibration relative to the full history of a user's interactions, including possibly outdated and stale preference categories. We conjecture that this approach can lead to recommendations that, while appearing calibrated, in fact distort users' true preferences. In this paper, we conduct a preliminary investigation of recommendation calibration at a more granular level, taking into account evolving user preferences. By analyzing differently sized training time windows from the most recent interactions to the oldest, we identify the most relevant segment of a user's preferences that optimizes the calibration metric. We perform an exploratory analysis with datasets from different domains with distinctive user-interaction characteristics. We demonstrate how the evolving nature of user preferences affects recommendation calibration, and how this effect is manifested differently depending on the characteristics of the data in a given domain. Datasets, code, and more detailed experimental results are available at: this https URL.
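To make the idea of measuring calibration against a window of the history concrete, the following sketch compares a recommendation list against the full history and against only the most recent interactions. It rests on several assumptions that the abstract does not state: the miscalibration measure is taken to be a smoothed KL divergence between category distributions (a common choice in the calibration literature), the smoothing weight `alpha` and all function names are hypothetical, and the recommendation list is held fixed rather than produced by a model retrained on each window.

```python
import math
from collections import Counter

def category_distribution(categories):
    """Empirical distribution over the categories of a list of interactions."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}

def miscalibration_kl(history, recommended, alpha=0.01):
    """Smoothed KL divergence from the history distribution p to the recommendation distribution q."""
    p = category_distribution(history)
    q = category_distribution(recommended)
    kl = 0.0
    for c, p_c in p.items():
        # Mix a little of p into q so the logarithm is defined when q misses a category.
        q_smooth = (1 - alpha) * q.get(c, 0.0) + alpha * p_c
        kl += p_c * math.log(p_c / q_smooth)
    return kl

# Hypothetical user: older interactions are dramas, recent ones are comedies.
history = ["drama"] * 4 + ["comedy"] * 4   # ordered oldest to newest
recommended = ["comedy"] * 4

full_history_score = miscalibration_kl(history, recommended)
recent_window_score = miscalibration_kl(history[-4:], recommended)
```

In this toy case the list looks miscalibrated against the full history but well calibrated against the recent window, which is the kind of discrepancy the abstract points to.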
- [3] arXiv:2405.10311 [pdf, ps, html, other]
Title: UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models
Comments: 11 pages, 7 figures
Subjects: Information Retrieval (cs.IR)
Recently, Multi-Modal (MM) Large Language Models (LLMs) have unlocked many complex use-cases that require MM understanding (e.g., image captioning or visual question answering) and MM generation (e.g., text-guided image generation or editing) capabilities. To further improve the output fidelity of MM-LLMs, we introduce the model-agnostic UniRAG technique that adds relevant retrieved information to prompts as few-shot examples during inference. Contrary to the common belief that Retrieval Augmentation (RA) mainly improves generation or understanding of uncommon entities, our evaluation results on the MSCOCO dataset with common entities show that both proprietary models like GPT4 and Gemini-Pro and smaller open-source models like Llava, LaVIT, and Emu2 significantly enhance their generation quality when their input prompts are augmented with relevant information retrieved by MM retrievers like UniIR models.
New submissions for Friday, 17 May 2024 (showing 3 of 3 entries)
- [4] arXiv:2405.10024 (cross-list from cs.LG) [pdf, ps, html, other]
Title: $\Delta\text{-}{\rm OPE}$: Off-Policy Estimation with Pairs of Policies
Subjects: Machine Learning (cs.LG); Information Retrieval (cs.IR)
The off-policy paradigm casts recommendation as a counterfactual decision-making task, allowing practitioners to unbiasedly estimate online metrics using offline data. This leads to effective evaluation metrics, as well as learning procedures that directly optimise online success. Nevertheless, the high variance that comes with unbiasedness is typically the crux that complicates practical applications. An important insight is that the difference between policy values can often be estimated with significantly reduced variance, if said policies have positive covariance. This allows us to formulate a pairwise off-policy estimation task: $\Delta\text{-}{\rm OPE}$.
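The pairwise idea in the paragraph above can be sketched with the standard Inverse Propensity Scoring (IPS) estimator: instead of estimating two policy values separately, one estimates their difference directly from the same logged data. This is a minimal sketch under assumptions, not the paper's estimators: the function names, the log format `(context, action, reward, logging_probability)`, and the toy policies are all hypothetical, and no control variate is included.

```python
def ips_value(logs, policy):
    """IPS estimate of a policy's value from (context, action, reward, logging_prob) tuples."""
    return sum(r * policy(x, a) / p0 for x, a, r, p0 in logs) / len(logs)

def delta_ips(logs, target, production):
    """IPS estimate of value(target) - value(production), computed on the same logged samples."""
    return sum(
        r * (target(x, a) - production(x, a)) / p0
        for x, a, r, p0 in logs
    ) / len(logs)

# Hypothetical context-free policies over two actions (0 and 1).
def target(x, a):
    return 0.8 if a == 1 else 0.2

def production(x, a):
    return 0.6 if a == 1 else 0.4

# Hypothetical logs collected by a uniform logging policy (probability 0.5 per action).
logs = [
    (0, 1, 1.0, 0.5),
    (0, 0, 0.0, 0.5),
    (0, 1, 1.0, 0.5),
    (0, 0, 1.0, 0.5),
]

difference = delta_ips(logs, target, production)
separate = ips_value(logs, target) - ips_value(logs, production)
```

The importance weight of each sample depends only on how much the two policies disagree on the logged action, so samples where the policies agree contribute nothing to the estimate.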
$\Delta\text{-}{\rm OPE}$ subsumes the common use-case of estimating improvements of a learnt policy over a production policy, using data collected by a stochastic logging policy. We introduce $\Delta\text{-}{\rm OPE}$ methods based on the widely used Inverse Propensity Scoring estimator and its extensions. Moreover, we characterise a variance-optimal additive control variate that further enhances efficiency. Simulated, offline, and online experiments show that our methods significantly improve performance for both evaluation and learning tasks.
- [5] arXiv:2405.10233 (cross-list from cs.SI) [pdf, ps, html, other]
Title: iDRAMA-Scored-2024: A Dataset of the Scored Social Media Platform from 2020 to 2023
Subjects: Social and Information Networks (cs.SI); Computers and Society (cs.CY); Information Retrieval (cs.IR)
Online web communities often face bans for violating platform policies, encouraging their migration to alternative platforms. This migration, however, can result in increased toxicity and unforeseen consequences on the new platform. In recent years, researchers have collected data from many alternative platforms, indicating coordinated efforts leading to offline events, conspiracy movements, hate speech propagation, and harassment. Thus, it becomes crucial to characterize and understand these alternative platforms. To advance research in this direction, we collect and release a large-scale dataset from Scored -- an alternative Reddit platform that sheltered banned fringe communities, for example, c/TheDonald (a prominent right-wing community) and c/GreatAwakening (a conspiratorial community). Over four years, we collected approximately 57M posts from Scored, with at least 58 communities identified as migrating from Reddit and over 950 communities created since the platform's inception. Furthermore, we provide sentence embeddings of all posts in our dataset, generated through a state-of-the-art model, to further advance the field in characterizing the discussions within these communities. We aim to provide these resources so that researchers can carry out their investigations without the need for extensive data collection and processing efforts.
- [6] arXiv:2405.10248 (cross-list from cs.HC) [pdf, ps, html, other]
Title: Co-Matching: Towards Human-Machine Collaborative Legal Case Matching
Comments: Draft V1: 23 pages, 7 figures
Subjects: Human-Computer Interaction (cs.HC); Information Retrieval (cs.IR)
Recent efforts have aimed to improve AI machines in legal case matching by integrating legal domain knowledge. However, successful legal case matching requires the tacit knowledge of legal practitioners, which is difficult to verbalize and encode into machines. This emphasizes the crucial role of involving legal practitioners in high-stakes legal case matching. To address this, we propose a collaborative matching framework called Co-Matching, which encourages both the machine and the legal practitioner to participate in the matching process, integrating tacit knowledge. Unlike existing methods that rely solely on the machine, Co-Matching allows both the legal practitioner and the machine to determine key sentences and then combine them probabilistically. Co-Matching introduces a method called ProtoEM to estimate human decision uncertainty, facilitating the probabilistic combination. Experimental results demonstrate that Co-Matching consistently outperforms existing legal case matching methods, delivering significant performance improvements over human- and machine-based matching in isolation (on average, +5.51% and +8.71%, respectively). Further analysis shows that Co-Matching also ensures better human-machine collaboration effectiveness. Our study represents a pioneering effort in human-machine collaboration for the matching task, marking a milestone for future collaborative matching studies.
Cross submissions for Friday, 17 May 2024 (showing 3 of 3 entries)
- [7] arXiv:2303.04689 (replaced) [pdf, ps, html, other]
Title: A Privacy Preserving System for Movie Recommendations Using Federated Learning
Comments: Accepted for publication in the ACM Transactions on Recommender Systems (TORS) Special Issue on Trustworthy Recommender Systems
Subjects: Information Retrieval (cs.IR); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
Recommender systems have become ubiquitous in the past years. They solve the tyranny of choice problem faced by many users, and are utilized by many online businesses to drive engagement and sales. Besides other criticisms, like creating filter bubbles within social networks, recommender systems are often reproved for collecting considerable amounts of personal data. However, to personalize recommendations, personal information is fundamentally required. A recent distributed learning scheme called federated learning has made it possible to learn from personal user data without its central collection. Consequently, we present a recommender system for movie recommendations, which provides privacy and thus trustworthiness on multiple levels: First and foremost, it is trained using federated learning and thus, by its very nature, privacy-preserving, while still enabling users to benefit from global insights. Furthermore, a novel federated learning scheme, called FedQ, is employed, which not only addresses the problem of non-i.i.d.-ness and small local datasets, but also prevents input data reconstruction attacks by aggregating client updates early. Finally, to reduce the communication overhead, compression is applied, which significantly compresses the exchanged neural network parametrizations to a fraction of their original size. We conjecture that this may also improve data privacy through its lossy quantization stage.
- [8] arXiv:2305.19840 (replaced) [pdf, ps, html, other]
Title: BEIR-PL: Zero Shot Information Retrieval Benchmark for the Polish Language
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
The BEIR dataset is a large, heterogeneous benchmark for Information Retrieval (IR) in zero-shot settings, garnering considerable attention within the research community. However, BEIR and analogous datasets are predominantly restricted to the English language. Our objective is to establish extensive large-scale resources for IR in the Polish language, thereby advancing the research in this NLP area. In this work, inspired by the mMARCO and Mr. TyDi datasets, we translated all accessible open IR datasets into Polish, and we introduced the BEIR-PL benchmark -- a new benchmark which comprises 13 datasets, facilitating further development, training and evaluation of modern Polish language models for IR tasks. We executed an evaluation and comparison of numerous IR models on the newly introduced BEIR-PL benchmark. Furthermore, we publish pre-trained open IR models for the Polish language, marking a pioneering development in this field. Additionally, the evaluation revealed that BM25 achieved significantly lower scores for Polish than for English, which can be attributed to the high inflection and intricate morphological structure of the Polish language. Finally, we trained various re-ranking models to enhance the BM25 retrieval, and we compared their performance to identify their unique characteristic features. To ensure accurate model comparisons, it is necessary to scrutinise individual results rather than to average across the entire benchmark. Thus, we thoroughly analysed the outcomes of IR models in relation to each individual data subset encompassed by the BEIR benchmark. The benchmark data is available at this https URL.
- [9] arXiv:2401.12732 (replaced) [pdf, ps, other]
Title: CDRNP: Cross-Domain Recommendation to Cold-Start Users via Neural Process
Comments: Reorganize the logical structure of the manuscript and supplement with necessary experiments
Subjects: Information Retrieval (cs.IR); Social and Information Networks (cs.SI)
Cross-domain recommendation (CDR) has been proven to be a promising way to tackle the user cold-start problem, which aims to make recommendations for users in the target domain by transferring the user preference derived from the source domain. Traditional CDR studies follow the embedding and mapping (EMCDR) paradigm, which transfers user representations from the source to the target domain by learning a user-shared mapping function, neglecting the user-specific preference. Recent CDR studies attempt to learn user-specific mapping functions in a meta-learning paradigm, which regards each user's CDR as an individual task, but neglects the preference correlations among users, limiting the beneficial information for user representations. Moreover, both paradigms neglect the explicit user-item interactions from both domains during the mapping process. To address the above issues, this paper proposes a novel CDR framework with neural process (NP), termed CDRNP. In particular, it develops the meta-learning paradigm to leverage user-specific preference, and further introduces a stochastic process by NP to capture the preference correlations among the overlapping and cold-start users, thus generating more powerful mapping functions by mapping the user-specific preference and common preference correlations to a predictive probability distribution. In addition, we introduce a preference remainer to enhance the common preference from the overlapping users, and finally devise an adaptive conditional decoder with preference modulation to make predictions for cold-start users with items in the target domain. Experimental results demonstrate that CDRNP outperforms previous SOTA methods in three real-world CDR scenarios.
- [10] arXiv:2404.14851 (replaced) [pdf, ps, html, other]
Title: From Matching to Generation: A Survey on Generative Information Retrieval
Subjects: Information Retrieval (cs.IR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Information Retrieval (IR) systems are crucial tools for users to access information, widely applied in scenarios like search engines, question answering, and recommendation systems. Traditional IR methods, based on similarity matching to return ranked lists of documents, have been reliable means of information acquisition, dominating the IR field for years. With the advancement of pre-trained language models, generative information retrieval (GenIR) has emerged as a novel paradigm, gaining increasing attention in recent years. Currently, research in GenIR can be categorized into two aspects: generative document retrieval (GR) and reliable response generation. GR leverages the generative model's parameters for memorizing documents, enabling retrieval by directly generating relevant document identifiers without explicit indexing. Reliable response generation, on the other hand, employs language models to directly generate the information users seek, breaking the limitations of traditional IR in terms of document granularity and relevance matching, offering more flexibility, efficiency, and creativity, thus better meeting practical needs. This paper aims to systematically review the latest research progress in GenIR. We will summarize the advancements in GR regarding model training, document identifier, incremental learning, downstream tasks adaptation, multi-modal GR and generative recommendation, as well as progress in reliable response generation in aspects of internal knowledge memorization, external knowledge augmentation, generating response with citations and personal information assistant. We also review the evaluation, challenges and future prospects in GenIR systems. This review aims to offer a comprehensive reference for researchers in the GenIR field, encouraging further development in this area.
- [11] arXiv:2401.00020 (replaced) [pdf, ps, other]
Title: ShennongAlpha: an AI-driven sharing and collaboration platform for intelligent curation, acquisition, and translation of natural medicinal material knowledge
Comments: 53 pages, 6 figures, 10 supplementary figures, 2 supplementary tables
Subjects: Artificial Intelligence (cs.AI); Databases (cs.DB); Information Retrieval (cs.IR)
Natural Medicinal Materials (NMMs) have a long history of global clinical applications and a wealth of records and knowledge. Although NMMs are a major source for drug discovery and clinical application, the utilization and sharing of NMM knowledge face crucial challenges, including the standardized description of critical information, efficient curation and acquisition, and language barriers. To address these, we developed ShennongAlpha, an AI-driven sharing and collaboration platform for intelligent knowledge curation, acquisition, and translation. For standardized knowledge curation, the platform introduced a Systematic Nomenclature to enable accurate differentiation and identification of NMMs. More than fourteen thousand Chinese NMMs have been curated into the platform along with their knowledge. Furthermore, the platform pioneered chat-based knowledge acquisition, standardized machine translation, and collaborative knowledge updating. Together, our study represents the first major advance in leveraging AI to empower NMM knowledge sharing, which not only marks a novel application of AI for Science, but also will significantly benefit the global biomedical, pharmaceutical, physician, and patient communities.