Investigação Operacional

Print version ISSN 0874-5161

Inv. Op. vol.24 no.2 Lisboa Dec. 2004

 

Graph-Based Structures for the Market Baskets Analysis

 

Luís Cavique †  

† ESCS-IPL / IST-UTL Portugal

lcavique@escs.ipl.pt

 

Abstract:

A market basket is defined as the set of items bought together by a customer on a single visit to a store. Market basket analysis is a powerful tool for implementing cross-selling strategies. Although some algorithms can find the market basket, they can be computationally inefficient. The aim of this paper is to present a faster algorithm for market basket analysis using condensed data structures. In this innovative approach, the condensed data is obtained by transforming the market basket problem into a maximum-weighted clique problem. First, the input data set is transformed into a graph-based structure; then the maximum-weighted clique problem is solved using a meta-heuristic approach in order to find the most frequent itemsets. The computational results show accurate solutions with reduced computational times.

Keywords: data mining, market basket, similarity measures, maximum clique problem.
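
The two steps named in the abstract (building a graph from the baskets, then searching for a heavy clique) can be illustrated with a minimal Python sketch. It is not the paper's algorithm, since the full text is not reproduced here. It assumes that edge weights are raw co-occurrence counts, and it replaces the meta-heuristic with a simple greedy search. The names `build_cooccurrence_graph`, `greedy_weighted_clique` and `min_support`, as well as the sample baskets, are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(baskets, min_support=1):
    """Map each item pair to the number of baskets containing both items.

    Assumption: the edge weight is a raw co-occurrence count.
    """
    pair_counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(set(basket)), 2):
            pair_counts[frozenset((a, b))] += 1
    # Condense the data: keep only pairs seen at least min_support times.
    return {pair: n for pair, n in pair_counts.items() if n >= min_support}


def greedy_weighted_clique(edges):
    """Grow a clique from the heaviest edge (greedy stand-in for a meta-heuristic)."""
    if not edges:
        return set(), 0
    neighbours = {}
    for pair in edges:
        a, b = tuple(pair)
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    # Start from the heaviest edge; ties are broken by item names.
    seed = min(edges, key=lambda p: (-edges[p], sorted(p)))
    clique = set(seed)
    weight = edges[seed]
    # Candidates are the vertices adjacent to every clique member.
    candidates = set.intersection(*(neighbours[v] for v in clique))

    def gain(v):
        # Total edge weight that v would add to the current clique.
        return sum(edges[frozenset((v, u))] for u in clique)

    while candidates:
        best = min(candidates, key=lambda v: (-gain(v), v))
        weight += gain(best)
        clique.add(best)
        candidates &= neighbours[best]
    return clique, weight


# Hypothetical sample data, one list per store visit.
baskets = [
    ["bread", "milk", "butter"],
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["beer", "chips"],
    ["milk", "bread", "beer"],
]
edges = build_cooccurrence_graph(baskets, min_support=2)
clique, weight = greedy_weighted_clique(edges)
print(sorted(clique), weight)  # ['bread', 'butter', 'milk'] 8
```

A greedy search can stop at a locally best clique, which is one reason a meta-heuristic may be preferred for the clique step on larger graphs.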

 

Full text available only in PDF.

 
