Symposium on Human-Computer Interaction HCITALY 2003 Torino
Symposium on Human-Computer Interaction HCITALY 2003
Torino, November 4-5, 2003

Liliana Ardissono and Anna Goy (Eds.)

RT 75/2003
Dipartimento di Informatica, Università di Torino
Corso Svizzera 185, 10149 Torino, Italy

Volume Editors

Liliana Ardissono
Dipartimento di Informatica, Università di Torino
Corso Svizzera 185, I-10149 Torino, Italy

Anna Goy
Dipartimento di Informatica, Università di Torino
Corso Svizzera 185, I-10149 Torino, Italy

Table of Contents

Preface 1

Invited Talk: From Adaptive Hypermedia to the Adaptive Web and Beyond
Peter Brusilovsky 3

Invited Talk: The Power of Experience - Design Strategy as a Way of Creating Imaginable Futures
Jan-Christoph Zoels 5

An Environment for Designing and Developing Multi-Platform Interactive Applications
Silvia Berti, Giulio Mori, Fabio Paternò and Carmen Santoro 7

Multimodal Directing in New Media Design
Letizia Bollini 13

Comparing Accessibility Evaluation Tools: Results from a Case Study
Giorgio Brajnik 17

Adaptive Management of the Answering Process for a Call Center System
Federica Cena and Ilaria Torre 23

User Interaction with an Automated Solver: The Case of a Mission Planner
Amedeo Cesta, Gabriella Cortellessa, Angelo Oddi and Nicola Policella 29

Usability Evaluation of Tools for Nomadic Application Development
Cristina Chesta, Carmen Santoro and Fabio Paternò 36

Navigating 3D Worlds by Following Virtual Guides: Helping both Users and Developers of 3D Web Sites
Luca Chittaro, Roberto Ranon and Lucio Ieronutti 42

Software Environments that Support End-User Development
Maria Francesca Costabile, Daniela Fogli, Giuseppe Fresta, Rosa Lanzilotti, Piero Mussio and Antonio Piccinno 48

Cyrano: a Character-Centered Architecture for Interactive Presentations
Rossana Damiano, Vincenzo Lombardo, Francesca Biral and Antonio Pizzo 53

Improving Recommendations by Integrating Collaborative Filtering and Supervised Learning Techniques
Marco Degemmis, Stefano Paolo Guida, Pasquale Lops, Giovanni Semeraro and Maria Francesca Costabile 60

Evidences for a Prototypical Organization of Websites' Page Layout
Francesco Di Nocera, Corinne Capponi and Fabio Ferlazzo 67

Evaluation Methodologies and User Involvement in User Modeling and Adaptive Systems
Cristina Gena 74

Fluidtime: Time Information Interfaces for Public Services
Michael Kieslinger 79

UPoet: Poetic Interaction with a 3D Agent
Michele Lamarca, Fabio Zambetta, Fabio De Felice and Fabio Abbattista 84

The Elettra Virtual Collaboratory: a CSCW System for Scientific Experiments with Ultra-Bright Light Sources
Roberto Ranon, Luca Chittaro and Roberto Pugliese 89

Visual Arts Web Spaces: Perception and Exploration Styles
Paolo Raviolo 94

Automatic Lightness and Color Adjustment of Visual Interfaces
Alessandro Rizzi, Carlo Gatta, Monica Maggiore, Eugenio Agnelli, Daniele Ferrari and Davide Negri 109

Developing Affective Lexical Resources
Alessandro Valitutti, Oliviero Stock and Carlo Strapparava 112

SAMIR: A 3D Web Intelligent Interface
Fabio Zambetta, Graziano Catucci, Fabio Abbattista, Giovanni Semeraro, Michele Lamarca and Fabio De Felice 118

Preface

Human-Computer Interaction (HCI) is
the research area that studies methodologies and techniques for the design and development of usable and reliable interactive systems supporting human activities. The 2nd Symposium on Human-Computer Interaction, held in association with Virtuality 2003 and MIMOS 2003, has been organized as a forum in which researchers from academia and industry can present and discuss their experiences in the design, development and evaluation of user interfaces for automated systems. The present volume includes a selection of the papers submitted to HCITALY 2003, which will be presented during the symposium in Torino. Moreover, the volume includes the abstracts of two invited lectures: the first is by Peter Brusilovsky (University of Pittsburgh), one of the most representative figures in Adaptive Hypermedia research; the second is by Jan-Christoph Zoels (Interaction Design Institute Ivrea), a leading figure in the interaction design area. We are sure that the workshop represents a great opportunity for disseminating information about the most recent Italian research in the HCI area.

Liliana Ardissono and Anna Goy

From Adaptive Hypermedia to the Adaptive Web and Beyond

Peter Brusilovsky
Department of Information Science and Telecommunications, School of Information Sciences
University of Pittsburgh, 135 North Bellefield Avenue, Pittsburgh
[email protected]

Web systems suffer from an inability to satisfy the heterogeneous needs of many users. A remedy for the negative effects of the traditional "one-size-fits-all" approach is to develop systems with an ability to adapt their behavior to the goals, tasks, interests, and other features of individual users and groups of users. The Adaptive Web is a relatively young research area.
Started with a few pioneering works on adaptive hypertext in the early 1990s, it now attracts many researchers from different communities such as hypertext, user modeling, machine learning, natural language generation, information retrieval, intelligent tutoring systems, cognitive science, and Web-based education. Currently, the established application areas of adaptive Web systems are education, information retrieval, and kiosk-style information systems. A number of more recent projects are also exploring new application areas such as e-commerce, medicine, and tourism. While research-level systems constitute the majority of adaptive Web systems, a few successful industrial systems show the commercial potential of the field. This talk will review a number of adaptation techniques that have been developed and evaluated in the field of adaptive hypermedia and applied in adaptive Web systems. It will also present several examples of adaptive Web systems in different application areas. To provide a connection with the virtual reality focus of the conference, the talk will specifically address the issue of developing adaptive systems for virtual environments "beyond" the classic hypermedia and the Web. It will discuss the challenges of implementing adaptive navigation support and adaptive presentation in a virtual reality context.

The Power of Experience - Design Strategy as a Way of Creating Imaginable Futures

Jan-Christoph Zoels
Interaction Design Institute Ivrea
[email protected]

The talk explores how user-centered design research methods can strengthen a foresight and innovation process by enhancing scenarios of the future with the visual, the spatial and the experiential.
The study reflects on a design foresight initiative, Project F: fabric care futures, that Whirlpool Europe, a leading manufacturer of domestic appliances, carried out as a multidisciplinary effort to use design strategy and user experience research in shaping its business policy for the next ten years. The design-ethnographic research revealed a great deal of information about the complexity of domestic life and uncovered attitudes about consumers' images of self, home, family and friends, as well as design and product preferences. This information not only led to new concept products, which guided the strategic planning process, but also informed current product development. Moreover, the user-centered and contextual design approach fostered a change in strategy and communication for the company. Results from this project show that, by using design to create tangible representations of future product solutions, the company was able to stimulate interest, buy-in and support internally, as well as to open up a more sustainable dialogue with all the stakeholders involved in its foresight strategy and decision-making.

An Environment for Designing and Developing Multi-Platform Interactive Applications

Silvia Berti, Giulio Mori, Fabio Paternò, Carmen Santoro
ISTI-CNR, Via G. Moruzzi 1, 56124 Pisa, Italy
{s.berti, g.mori, f.paterno, c.santoro}@isti.cnr.it

Abstract. The increasing availability of new types of interaction platforms raises the need for new methods and tools to support the development of nomadic applications. This paper presents a solution based on the use of multiple levels of abstraction and a number of transformations able to obtain concrete interfaces, taking into account the different platforms and preserving usability.

1 Introduction

With the advent of the wireless Internet and the rapidly expanding market of smart devices, designing interactive applications supporting multiple platforms has become a difficult issue.
On the one hand, the decreasing cost of these devices has enabled an increasing variety of people to become potential users of the features and services of new generations of communication technology. Moreover, the increasing use of such devices in mobile settings has stimulated a growing interest in exploring the unique opportunities offered by multimodal applications (graphic and/or vocal), with the aim of providing users with as much flexibility as possible and supporting more natural and faster interaction. On the other hand, this flourishing range of opportunities has rarely become effective, due to the low quality of the user interfaces provided to users. The main problem is that many assumptions held up to now about classical stationary desktop systems are being challenged when moving towards nomadic applications, i.e. applications that can be accessed through multiple devices from different locations. Consequently, one fundamental issue is how to support software designers and developers in building such applications: in particular, there is a need for novel methods and tools able to support the development of interactive software systems that can adapt to different targets while preserving usability. Model-based approaches [1] could represent a feasible solution for addressing such issues: the basic idea is to identify useful abstractions highlighting the main aspects that should be considered when designing effective interactive applications. In particular, the capability to use task models to describe the logical activities that should be supported by an interactive system in this heterogeneous context is an important starting point for overcoming the limitations of current approaches. In the paper we first describe the basic concepts of our method. Then, we move on to describing the TERESA tool that has been designed and implemented to support this approach.
Lastly, we provide some examples of applying the method to a virtual museum.

2 The Method

Our method [2] for model-based design is composed of a number of steps (see Figure 1) that allow designers to start with an envisioned overall task model of a nomadic application and then derive concrete and effective user interfaces for multiple devices.

Fig. 1. The transformations supported by our method

The main steps are:

- High-level task modelling: the output of this phase is a description of the logical activities that need to be performed in order to reach the users' goals. This description initially considers an integrated task model where all the activities to be supported are specified. Next, the task model is refined and structured so as to identify the activities to be supported on each platform considered.

- Abstract user interface (AUI): in this phase the focus shifts to the interaction objects supporting task performance. An abstract user interface is defined in terms of presentations, identifying the set of user interface elements perceivable at the same time, and each presentation is composed of a number of interactors [3], which are abstract interaction objects identified in terms of their main semantic effects. An XML-based language has been specified in order to describe the organisation of the various interactors within the presentations. The structure of the presentation is defined in terms of elementary interactors, characterised by the task they support, and their composition operators.
Such operators are classified according to the communication goals to achieve: a) Grouping: indicates a set of interface elements logically connected to each other; b) Relation: highlights a one-to-many relation among some elements, where one element has some effect on a set of elements; c) Ordering: some kind of ordering among a set of elements can be highlighted; d) Hierarchy: different levels of importance can be defined among a set of elements.

- User interface generation: this phase is completely platform-dependent and has to consider the specific properties of the target device. The interactors are mapped onto interaction techniques supported by the particular device configuration considered (operating system, toolkit, etc.), and the operators defined in the language for the abstract user interface are implemented with appropriate presentation techniques.

3 The TERESA Tool

TERESA [4] is a transformation-based environment designed and developed by our group at ISTI-CNR. It is intended to provide a complete semi-automatic environment supporting a number of transformations, useful for designers to build and analyse their designs at different abstraction levels and consequently generate the concrete user interface for a specific type of platform. A number of main requirements have driven the design and development of TERESA:

- Mixed initiative: we want a tool able to support different levels of automation, ranging from completely automatic solutions to highly interactive solutions where designers can tailor or even radically change the solutions proposed by the tool.

- Model-based: the variety of platforms increasingly available can be better handled through abstractions that allow designers to have a logical view of the activities to support.

- XML-based: XML-based languages have been proposed for every type of domain. In the field of interactive systems there have been a few proposals that partially capture the key aspects to be addressed.
- Top-down: this approach is an example of forward engineering. Various abstraction levels are considered, and we support cases where designers have to start from scratch. So, they first create the more logical descriptions, and then move on to more concrete representations, down to the final system.

- Different entry points: our approach aims to be comprehensive and to support the entire task/platform taxonomy. However, there can be cases where only a part of it needs to be supported.

- Web-oriented: the Web is everywhere, so we decided that, in order to be general, Web applications should be our first target. However, the approach can easily be extended to other environments (such as Java applications, Microsoft environments, ...) because only the last transformation needs to be modified for this purpose.

The TERESA tool offers a number of transformations between the different levels of abstraction and provides designers with an easy-to-use integrated environment for generating both XHTML and VoiceXML user interfaces [5]. With the TERESA tool, at each abstraction level the designer is in the position of modifying the representations, while the tool maintains the forward and backward relationships with the other levels thanks to a number of automatic features (e.g. links between abstract interaction objects and the corresponding tasks in the task model, so that designers can immediately identify their relations). This is a great advantage for designers in maintaining a unique overall picture of the system, with increased consistency among the user interfaces generated for the different devices and consequently improved usability for end users.

4 Generating Multi-Platform User Interfaces

In this section we show some examples of user interfaces derived by applying the described approach to a museum application.
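The generation phase described above (mapping abstract composition operators onto platform-specific presentation techniques) can be sketched as a simple lookup. This is an illustrative reconstruction, not TERESA's actual code; the operator names come from the paper, while the function and platform identifiers are assumptions made for illustration.

```python
# Illustrative sketch (NOT TERESA's implementation): mapping abstract
# composition operators onto concrete presentation techniques per
# platform. The vocal platform realizes Grouping with delimiting
# sounds, as the paper describes for the museum example.
TECHNIQUES = {
    ("grouping", "desktop"): "list of graphical buttons",
    ("grouping", "cellphone"): "pull-down menu",
    ("grouping", "voice"): "grouping sound before and after the elements",
    ("ordering", "desktop"): "numbered list",
    ("ordering", "voice"): "items read out in sequence",
}

def realize(operator: str, platform: str, elements: list[str]) -> str:
    """Map one abstract composition operator onto a concrete technique."""
    technique = TECHNIQUES.get((operator, platform))
    if technique is None:
        raise ValueError(f"no technique for {operator!r} on {platform!r}")
    return f"{technique}: {', '.join(elements)}"

# The grouping of artworks in the museum section, rendered vocally:
print(realize("grouping", "voice", ["Boat", "Totem", "Hole"]))
# -> grouping sound before and after the elements: Boat, Totem, Hole
```

Only this last mapping is platform-dependent, which is why, as noted above, retargeting the approach to a new environment requires modifying only the final transformation.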
[Screenshots of the desktop and cellphone presentations appear in the original; the vocal presentation is transcribed below.]

VoiceXML-enabled phone. System: "The 'Boat' has been achieved through the subtle divisions of the planes enveloping its central part, which is only rough-hewn; the material is white marble." (Five-second pause) "Remember that if you would like to return to the main menu, say 'home', or if you would like to go back to the previous menu, say 'back'."

Table 1. Presentations of artwork information on different platforms.

More specifically, in Table 1 the presentations refer to a situation in which the user has selected information about a specific artwork of the Modern Sculpture section. As can be noted, there are some differences concerning the presentation of the selected artwork: on the desktop system the picture of the artwork is shown, together with additional information (title, description, artist, material, etc.); on the cellphone the picture is provided as well, but the textual information is more concise; on a VoiceXML-enabled system a vocal description of the artwork is provided. It is worth pointing out that the navigation mechanisms also change across platforms: on both the desktop system and the cellphone some links are (visually) presented, although they differ in number. In fact, on the desktop system there is enough room to also display links to other sections of the museum, whereas on the cellphone the choice is more limited. In the third case (VoiceXML) the navigation is implemented by vocal dialogs.

[Screenshots of the desktop and cellphone presentations appear in the original; the vocal presentation is transcribed below.]

VoiceXML-enabled system. System: (grouping sound) "The artworks contained in the section are the following: 'Boat', 'Totem', 'Hole'. If you would like information on one of these please say its name." (grouping sound) "Remember that if you would like to return to the main menu, say 'home'."

Table 2.
Implementation of the grouping operator on different platforms.

Another interesting point is the different implementations of the operators of the abstract user interface on the various platforms. Table 2 shows the different implementations of the grouping composition operator: in the first two presentations it is obtained through a list of graphical buttons (1st case) or a pull-down menu (2nd case), whereas in the vocal interface (3rd case) sounds delimit the grouped elements.

5 Conclusions

In this paper a model-based approach for designing and developing multi-platform applications has been presented, and the main features of the related TERESA tool have been outlined. Some examples of generated presentations have been provided as well, highlighting the capability of the approach to provide usable multimodal interfaces for a broad set of mobile devices, including vocal interaction techniques. The tool is publicly available at http://giove.cnuce.cnr.it/teresa.html. This work has been supported by the IST V Framework CAMELEON (Context Aware Modelling for Enabling and Leveraging Effective interaction) project. More information is available at http://giove.cnuce.cnr.it/cameleon.html.

References

1. Puerta, A., Eisenstein, J.: Towards a General Computational Framework for Model-based Interface Development Systems. Proceedings ACM IUI'99, pp. 171-178.
2. Paternò, F., Santoro, C.: A Unified Method for Designing Interactive Systems Adaptable to Mobile and Stationary Platforms. Interacting with Computers, Vol. 15, N. 3, pp. 347-364, Elsevier, 2003.
3. Paternò, F., Leonardi, A.: A Semantics-based Approach to the Design and Implementation of Interaction Objects. Computer Graphics Forum, Blackwell, Vol. 13, N. 3, pp. 195-204, 1994.
4. Mori, G., Paternò, F., Santoro, C.: Tool Support for Designing Nomadic Applications. Proceedings ACM IUI'03, Miami, pp. 141-148, ACM Press.
5. S. Berti, F.
Paternò: Model-based Design of Speech Interfaces. Proceedings DSV-IS 2003, Springer Verlag.

Multimodal Directing in New Media Design

Letizia Bollini
PhD, Lecturer at Scienze di Internet, Università di Bologna
Dip. Scienze dell'Informazione, Mura Anteo Zamboni 7, 40127 Bologna, Italy
[email protected]

Abstract. Research on multimodal directing provides a theoretical ground for the improvement of hermeneutic and design methods. The starting point of this research is to define new media as communication tools structured around hypertextual links. Many communication channels simultaneously convey information to the user's perception. Every single channel must work as a cooperating element with the other modes of communication within a complex system. As stated in the approach of cognitive psychology, and following the metaphor of Canter's multimedia authoring software, HCI is very similar to a theatre drama. Multimodal directing, as a discipline of new media design, uses and reshapes knowledge previously produced within its own field, while constantly fostering a critical knowledge of professional practice.

1 Multimodal Interfaces: from Multimedia to Multimodal

The concept of multimodal interfaces emerged in the field of visual communication design at a moment of strong transition, becoming the starting point and the occasion for rethinking and redefining the discipline itself. The introduction of information technologies into the world of visual design has produced several innovations, both in the production system (with Desktop Publishing technologies) and by creating new market sectors (electronic publishing, the web, etc.).
This opens unprecedented perspectives in the search for new languages capable of exploiting the potential of digital media, not as a simple transfer of an established culture into a field yet to be explored, but as original and experimental research into new and appropriate expressive solutions. One of the problems that makes a theoretical-critical reading of the "new media" phenomenon difficult is precisely the trivialization to which the term has been subjected. The semantic blur already arises in the choice of name, which merges two generic terms, new and media, whose meaning is not established a priori but only relative to their use and combination. In the current debate, in fact, "new media" or, commonly, "multimedia" is used as a synonym of "hypermedia", a term coined by Ted Nelson [6] to indicate "programs and applications that include a variety of media (such as text, images, video, audio and animation) whose presentation is interactively controlled by the user". The fundamental aspects identified by this definition are three: the technological support, the registers of communication, and the user's interaction with a non-linear system of notation and reading. Yet a subtle and unresolved ambiguity remains in the use of the term "media". As Papert [7] points out, the main characteristic of the computer as a "medium" is its ability to simulate other media, that is, to integrate several media registers into a single support. The computer can be seen from an ambivalent perspective, both as a "tool" and as a "mode" [8]. This change of perspective allows us to outline two poles around which to orient research: that of the "instrumental supports", i.e. the "media", and that of the "languages", the modes of communication.
We will therefore speak not so much of "multimedia", but rather of a single "medium" that constitutes an integrated platform between a notational system and several communication channels, or better, several modes of communication. Multimodality, then, means exploiting different and complementary communication channels simultaneously, allowing an interaction between computer and user that mirrors the physical and cognitive richness of the human interlocutor.

2 Multimodal Directing

With the book Computers as Theatre, Laurel [5] addresses the question from the point of view of experience and interprets interaction through GUIs according to the metaphor of the theatre. Looking at the computer not as a "tool" but as a "medium" means overturning the interpretive paradigm. This inevitably has considerable repercussions in the field of interface design: interfaces are no longer considered a simple contact surface between us and the machine, but rather media that contribute to defining the user's experience and that must satisfy the user's sensory, cognitive and emotional processes [10]. The vision is therefore broader and more global. It is not enough to solve individual procedural sequences; they must rather be considered in an integrated way with respect to the perceptual experience as a whole. The metaphor of the theatre seems to have further domains of validity as a "metaphor of making". Marc Canter, starting from his very personal artistic experimentation, uses this same simile in the interface of his authoring tool. Director is the most faithful interpreter of the theatre metaphor, not so much as an interpretive key for the interaction taking place in the interface, but for the design of the interface itself. The metaphorical and operational model is that of theatrical "staging".
It is the designer who, orchestrating and directing the individual presences and their performances, gives the user (a non-passive spectator) the key to reading the interactive experience. [9] If the theatre metaphor exhaustively interprets human-computer interaction, we may suppose that it can be extended not only from the moment of fruition to that of production, but also to that of conceiving the interface system. The designer's task seems analogous not so much to that of the traditional graphic designer, but rather to that of the film director, who receives the screenwriter's work, on which he collaborates, elaborates it in the form of cinematic or theatrical language, and coordinates and supervises the shooting and staging, the sound recording, the editing, and so on. [11] The most significant peculiarity that emerges from having to handle a plurality of communication channels, each characterized by its own precise logic and specific syntax, is the need for a design that, as in the director's work, is able to blend every single component so as to integrate it with the others, giving life to a systemic artifact. To better interpret this specificity of multimodal directing, we can resort to Julia Kristeva's definition of the concept of intertextuality as the transposition of one or more sign systems into another, accompanied by a new articulation of the enunciative and denotative position. This concept is absolutely fundamental to new media and to the intersection of the languages of the multiple communication channels that their design involves: each communication channel is characterized by its own language, by a peculiar form of representation that cannot be transposed, by simple translation, from one channel to another.
Rearticulating the message and its extension from one channel to another in design terms implies a linguistic act of producing meaning that is coherent with the global message and not only with each channel's own lexicon and syntax. [3] The analysis of the metaphor of theatrical representation and its evolutions, and the introduction of the concept of environment according to the multimodal approach to new media, has allowed us to move from the concept of page and page layout, still a graphic/visual and verbocentric conception of design, to that of the staging of an interactive system: "In the wake of technological innovations, kinetics and fluidity become ever more distinctive traits in new virtual spatio-temporal environments in which the original discontinuous hypertextual structure is transformed into a continuous environment within which an intentional directing of the actors' entrances on stage becomes possible [...] We thus pass from the paradigm of the page to that of the environment [...] an event-place that is the quintessence of the anaphoric and ostensive model". [1] The designer should therefore be a director who organizes a performance that will be activated by the user in a spatio-temporally deferred condition. We will therefore call "multimodal directing" the discipline of design, here applied to new media, whose distinctive characteristics are the design of a multimodal environment, design according to an intertextual scheme of the various expressive modes, and the presence of a structure organized into hypertextual nodes that is activated in the presence of the co-authorial figure of the user. [2]

References

1. Anceschi, G.: "La fatica del web". Il Verri, n. 16 (2001) 23-45
2. Bollini, L.: Multimodal Directing in New-Media. PhD Thesis, Politecnico di Milano (2001)
3. Carlini, F.: Lo stile del web. Einaudi, Milano (2001)
4. Kristeva, J.: La révolution du langage poétique. Seuil, Paris (1974)
5.
Laurel, B.: Computers as Theatre. Addison-Wesley (1991)
6. Nelson, T.: Computer Lib / Dream Machines. Tempus Books/Microsoft Press, Redmond, Wash. (1987)
7. Papert, S.: Mindstorms: Children, Computers and Powerful Ideas. Basic Books, New York (1980)
8. Polillo, R.: Il design dell'interazione. In Anceschi, G. (ed.): "Il design delle interfacce". Domus Academy, Milano (1993)
9. Packer, R., Jordan, K.: Multimedia: From Wagner to Virtual Reality. Norton & Company, New York (2001)
10. Shedroff, N.: Experience Design. New Riders, Indianapolis (2001)
11. Toschi, L.: Il linguaggio dei nuovi media. Apogeo, Milano (2002)

Comparing Accessibility Evaluation Tools: Results from a Case Study

Giorgio Brajnik*
Dip. di Matematica e Informatica, Università di Udine, Italy
[email protected]

Abstract. The paper claims that the effectiveness of automatic tools for evaluating website accessibility must itself be evaluated, given the increasingly important role that these tools play. The paper presents a comparison method for a pair of tools that considers the tools' correctness, completeness and specificity in supporting the task of assessing the conformance of a website with respect to established guidelines. The paper presents data acquired during a case study comparing LIFT Machine with Bobby. The data acquired from the case study is used to assess the strengths and weaknesses of the comparison method. The conclusion is that, even though there is room for improving the method, it is already capable of providing accurate and reliable conclusions.

1 Introduction

An accessible website is a site that can be perceived, operated and understood by persons despite their congenital or induced disabilities [11, 9]. I argued elsewhere [3] that web accessibility is just one facet of web quality of use, and that quality of use is, together with utility, visibility and credibility, one of the pillars upon which a website's success depends.
A website that falls short in any of these properties severely hinders its success. One of the claims in [3] is that unless automatic web-testing tools are deployed in the normal processes of design, development and maintenance of websites, the quality of websites is unlikely to improve. This is due to an unfavorable combination of factors related to the dynamics of these processes and the nature of websites. Very short release cycles, lack of resources (time, people, money), incomplete and vague specifications, rapidly evolving technologies, the complexity of designing information and interaction architectures, and the ease of use of powerful authoring tools (like Dreamweaver or Flash) are the main reasons why web developers do not pay much attention to the quality level of websites. Only if (at least some) quality factors are treated in an automatic way, so that the computer deals with the most repetitive and trivial details, can the developer devote time to learn about and focus on more important quality issues.

* Scientific advisor for UsableNet Inc.

At the moment there are several tools for testing the accessibility of websites (see the appropriate page at W3C [12]). They differ on several dimensions, ranging from functionality (testing vs. fixing), to form of supported interaction (online service vs. desktop application integrated in authoring tools), effectiveness, reliability, cost, etc. Some of the tools have been in the arena for several years (for example Bobby, which was initially developed by CAST and freely available; it has since been acquired by a commercial company and is now sold). It is very important to be able to evaluate the quality of these tools, as they play the key role of enabling an average web developer to develop better websites. Only if reasonable evaluation methods are defined and used can the quality of these tools be assessed in a relatively standard way.
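To make concrete the kind of repetitive check such tools automate, consider the classic WCAG requirement that every image carry a text equivalent. The following minimal sketch is illustrative only, not taken from LIFT Machine or Bobby; it uses Python's standard-library HTML parser to flag img elements without an alt attribute.

```python
# Minimal illustrative sketch of one automated accessibility check
# (NOT the implementation of LIFT Machine or Bobby): flag <img>
# elements that lack an alt attribute, in the spirit of the WCAG 1.0
# requirement of a text equivalent for every non-text element.
from html.parser import HTMLParser

class ImgAltChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.problems = []  # human-readable problem reports

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)  # attrs arrives as a list of (name, value)
        if tag == "img" and "alt" not in attr_map:
            self.problems.append(
                f"<img src={attr_map.get('src')!r}> has no alt text"
            )

checker = ImgAltChecker()
checker.feed('<p><img src="boat.jpg" alt="Boat"><img src="totem.jpg"></p>')
print(checker.problems)  # the image without alt text is reported
```

Real tools run hundreds of such checks over entire sites, which is precisely why their own correctness and completeness deserve evaluation.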
Such evaluations would make it easier for a web developer to compare different tools and make an appropriate choice. Furthermore, they could sharpen the competition between tool manufacturers and, in the end, improve the tools themselves. The goal of this paper is to illustrate a method for comparing different tools that is: useful, in that it pinpoints strengths and weaknesses of tools in terms of their effectiveness; viable, in the sense that the method can be carried out with limited resources; and repeatable, in the sense that independent applications of the method to the same tools should lead to similar results. These properties of the method are demonstrated, where possible, by results derived from a case study using LIFT Machine (a multi-user web accessibility testing system developed by UsableNet, version 1.3) and Bobby (a desktop application for testing web accessibility acquired and engineered by Watchfire, version 4, "Bobby Core WorldWide").¹ A longer version of this paper [4] contains all the data and a detailed description of the steps performed when applying the method to the case study.
2 Related work
A thorough analysis was conducted by Thatcher [10]. Unfortunately this evaluation (of six accessibility testing tools) was only temporarily posted on his website before being taken offline. It was aimed at determining the cost/benefit ratio and at helping potential customers select the most appropriate tool. In addition to considering costs, availability, and accuracy, the scope of the evaluation was quality of use in general.
The method he used is heavily based on manual and systematic inspection of the results produced by the tools on selected test pages, and it is therefore less generally applicable than the method proposed in this paper, as it requires carefully prepared test files and a long analysis of the results. A survey of the web usability properties that are (or could be) tested automatically is given in [1]. Ivory and Hearst propose in [7] a taxonomy of usability evaluation methods. A CHI workshop was devoted to discussing and comparing the capabilities of guideline-based tools and model-based tools [5]. Another, more general and abstract view of how web testing tools can be evaluated is presented in [8]. In none of these cases, however, was the problem of designing an appropriate method for assessing the effectiveness of tools tackled. A tool validation method is proposed in [2], based on the indirect feedback that a web developer provides to a testing tool. The tool can detect (within certain limits) that, between two consecutive evaluations of the same website, a problem highlighted in the first evaluation has disappeared from the second one. If we assume that the web developer fixed that problem, and that the fix was prompted/suggested/guided by the warnings issued by the tool, then this is an indirect way of measuring the utility of the tool.
¹ Comparing these tools is a little unfair given their scope, flexibility, power and price: LIFT Machine is targeted at an enterprise-level quality assurance team, with a price starting at $6000; Bobby 4.0 was available for free (it now runs at about $250) and is targeted at a single individual wanting to test a relatively limited number of pages. Nevertheless the comparison is useful as a case study for demonstrating the evaluation method itself.
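The indirect validation idea of [2] can be sketched as a set difference between two consecutive evaluation runs on the same site. The issue identifiers and the "fix rate" metric below are illustrative assumptions for this sketch, not the actual procedure of [2]:

```python
def fix_rate(first_run: set[str], second_run: set[str]) -> float:
    """Fraction of issues reported in the first evaluation that are no
    longer reported in the second one. Under the assumption of [2], a
    disappeared issue was fixed because the tool flagged it, so a higher
    rate is indirect evidence of the tool's utility."""
    if not first_run:
        return 0.0
    fixed = first_run - second_run
    return len(fixed) / len(first_run)

# Hypothetical issue identifiers (page element + problem type):
run1 = {"img-23:missing-alt", "table-4:no-headers", "p-7:low-contrast"}
run2 = {"table-4:no-headers"}
print(fix_rate(run1, run2))  # 2 of the 3 reported issues disappeared
```

As the paper notes right below, the weakness of this measure is that it cannot tell whether a fix reflects genuine understanding or blind obedience to the tool.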
The biggest limitation of this method is that it depends on the behavior of the web developer, who might decide to fix a problem only because the tool suggested it, rather than because it is recognized as being an accessibility issue.
3 Relevant questions
The important questions related to the effectiveness of tools can be framed around these basic concepts:
– how complete a tool is in covering the relevant aspects to be detected (i.e., how many of the accessibility defects present in the website are caught and correctly shown to the user);
– how correct it is (i.e., the proportion of problems reported by the tool that are indeed true problems); and
– how specific it is (i.e., the number of different possible issues that can be detected and described by the tool).
Completeness and correctness are both necessary for characterizing effectiveness. A complete (but incorrect) tool could simply flag every page with all sorts of problems, thereby generating a large number of false positives, i.e., statements of detected problems that are not true. Conversely, an incomplete but correct tool could issue a problem if and only if an IMG tag has no associated ALT attribute. No false positives are generated by such a tool, but many other possible accessibility defects go uncaught. Such a tool generates a large number of false negatives: true problems that are not caught. Completeness is a difficult property to characterize operationally, because it requires knowing in advance the set of true problems of a website. For this reason the method described in this paper uses an approximation: the set of true problems is approximated by the union of the issues generated by any of the tools. Correctness is easier to deal with. False positives cannot be avoided for accessibility (unless we settle for a very low level of completeness).
In fact many accessibility problems deal with the perception and interpretation of information, and only in a few cases can these aspects be made fully explicit and formally characterizable. For example, the guideline that says "14: Ensure that documents are clear and simple" obviously cannot be tested automatically. In these cases tools can use some model of text readability (e.g., the Gunning-Fog measure [6]) and support the decision-making process of the user, who has to determine whether the page uses simple enough language. The role of the tool in this case is to highlight certain features of the website so that an appropriate decision can be made by the user upon further investigation. The ratio of manual tests to all the tests available within a tool is one way to characterize correctness: a tool based only on manual tests would raise many issues, many of which could have been filtered out without bothering the user, i.e., false positives. In addition to completeness and correctness, I also introduce the specificity property. The specificity of a tool is reflected in the level of detail that the tool is able to use when describing a potential problem. If a tool raises only very general concerns about a page (for example, it warns the user with a message like "the page contains non-textual elements with no text equivalent"), then the specificity of the warnings is too low with respect to the amount of detail needed by the web developer to understand the problem, diagnose it (i.e., find its root causes) and plan its solution. A tool suggesting that an image lacks an ALT and that the correct ALT should be the empty string (i.e., ALT="") because the image is a spacer is more useful than a tool simply saying that the image requires an ALT.
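As an illustration of such a readability model, here is a rough sketch of a Gunning-Fog estimate (0.4 times the sum of average sentence length and the percentage of "complex" words, i.e., words of three or more syllables). The syllable counter is a crude vowel-group heuristic, an assumption made for illustration, not the formulation used by any of the tools discussed:

```python
import re

def gunning_fog(text: str) -> float:
    """Crude Gunning-Fog readability estimate, only meant to illustrate
    how a tool might flag pages for manual review rather than decide
    automatically whether the language is simple enough."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+", text)

    def syllables(word: str) -> int:
        # Count runs of vowels as syllables: a rough approximation.
        return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

    complex_words = [w for w in words if syllables(w) >= 3]
    return 0.4 * (len(words) / len(sentences)
                  + 100 * len(complex_words) / len(words))
```

A tool could then highlight pages whose index exceeds a chosen threshold and leave the final judgment to the user, exactly the supporting role described above.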
3.1 Results of the case study
As a summary of the results of the case study: after generating more than 46,000 issues about 5 different websites, randomly selecting 305 of them, and manually determining which ones were false positives (FP) and which ones were false negatives (FN), we can state that:
– correctness:
• with probability 99%, LIFT Machine generates fewer than 8% false positives when using automatic tests only (fewer than 23% otherwise);
• with probability 99%, Bobby generates fewer than 20% false positives when using automatic tests only (fewer than 31% otherwise);
• the claim that LIFT Machine generates fewer false positives than Bobby is true with probability 99.91% if we consider automatic tests only, and 97.4% otherwise.
– completeness:
• Bobby covers all 46 checkpoints, while LIFT Machine covers 45 of them;
• 30% of Bobby's tests are automatic, while 49% of LIFT Machine's tests are automatic;
• with probability 99%, LIFT Machine generates fewer than 21% false negatives when using automatic tests only (fewer than 15% otherwise);
• with probability 99%, Bobby generates fewer than 31% false negatives when using automatic tests only (fewer than 22% otherwise);
• the claim that LIFT Machine generates fewer false negatives than Bobby is true with probability 99.95% if we consider automatic tests only, and more than 99.99% otherwise.
– specificity:
• Bobby implements 70 tests, while LIFT Machine has 103;
• for the 6 most populated checkpoints (i.e., those with the largest number of tests implementing them), LIFT Machine has a number of automatic tests equal to or greater than the corresponding number for Bobby (up to 50% more tests for some checkpoints);
• for the 6 most populated checkpoints, LIFT Machine has a number of (automatic and manual) tests equal to or greater than the corresponding number for Bobby (up to 350% more tests for some checkpoints).
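The paper does not spell out the statistical procedure behind bounds of the kind reported above (e.g., "with probability 99%, under 8% false positives"). One standard way to derive such a bound from a manually labeled random sample is a one-sided Clopper-Pearson upper confidence bound on the binomial false-positive rate; the sketch below assumes that procedure for illustration:

```python
from math import comb

def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def upper_bound(k: int, n: int, confidence: float = 0.99,
                tol: float = 1e-6) -> float:
    """Smallest rate p such that observing only k false positives out of
    n sampled issues would be unlikely (probability <= 1 - confidence):
    a one-sided Clopper-Pearson upper bound, found by bisection."""
    lo, hi = k / n, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) > 1 - confidence:
            lo = mid        # mid is still plausible; the bound is higher
        else:
            hi = mid
    return hi

# E.g., 0 false positives in a sample of 100 labeled issues still only
# supports "fewer than about 4.5% FP with 99% confidence".
print(upper_bound(0, 100))
```

The sample sizes and counts here are invented; the point is that even a clean sample of modest size yields a bound of several percent, which is consistent with the order of magnitude of the figures reported in the case study.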
4 Conclusions
The comparison method described in this paper is viable; it produces reasonable data concerning the correctness, completeness and specificity of the tools being compared, and it supports both direct comparison statements and estimates of the false positives and false negatives of the two tools. It can therefore be effectively used in comparisons of pairs of tools. An interesting step for refining the method would be to use it, with a group of people, to run a larger-scale comparison (i.e., with more websites and a larger sample of issues), perhaps with some additional steps aimed at cross-validating the classification. In such a context it should also be possible to test the repeatability of the comparison method, a property that cannot be demonstrated on the single case study reported in this paper.
References
1. G. Brajnik. Automatic web usability evaluation: what needs to be done? In Proc. Human Factors and the Web, 6th Conference, Austin, TX, June 2000. http://www.dimi.uniud.it/∼giorgio/papers/hfweb00.html.
2. G. Brajnik. Towards valid quality models for websites. In Proc. Human Factors and the Web, 7th Conference, Madison, WI, 2001. http://www.dimi.uniud.it/∼giorgio/papers/hfweb01.html.
3. G. Brajnik. Using automatic tools in accessibility and usability assurance processes. Unpublished note; http://www.dimi.uniud.it/∼giorgio/papers/assuranceproc-note/index.html, Nov 2002.
4. G. Brajnik. Comparing accessibility evaluation tools: a method for tool effectiveness. Unpublished note; http://www.dimi.uniud.it/∼giorgio/papers/evalmethod/index.html, Sept 2003.
5. T. Brink and E. Hofer. Automatically evaluating web usability. CHI 2002 Workshop, April 2002.
6. R. Gunning. The Techniques of Clear Writing. McGraw-Hill, 1968.
7. M. Ivory and M. Hearst. The state of the art in automated usability evaluation of user interfaces. ACM Computing Surveys, 33(4):173–197, Dec 2001.
8. D. Scapin, C. Leulier, J. Vanderdonckt, C. Mariage, C. Bastien, C.
Farenc, P. Palanque, and R. Bastide. Towards automated testing of web usability guidelines. In Proc. Human Factors and the Web, 6th Conference, Austin, TX, June 2000. http://www.tri.sbc.com/hfweb/scapin/Scapin.html.
9. J. Slatin and S. Rush. Maximum Accessibility: Making Your Web Site More Usable for Everyone. Addison-Wesley, 2003.
10. J. Thatcher. Evaluation and repair tools. Formerly posted on http://www.jimthatcher.com, June 2002. No longer available.
11. J. Thatcher, C. Waddell, S. Henry, S. Swierenga, M. Urban, M. Burks, B. Regan, and P. Bohman. Constructing Accessible Web Sites. glasshaus, 2002.
12. W3C Web Accessibility Initiative. Evaluation, repair, and transformation tools for web content accessibility. http://www.w3.org/WAI/ER/existingtools.html.
Adaptive Management of the Answering Process for a Call Center System
Federica Cena and Ilaria Torre
Department of Computer Science – University of Torino, Corso Svizzera 185 - 10149 Torino (Italy) [email protected], [email protected]
Abstract. This paper describes the development of an Adaptive Call Center that personalizes the management of the answering process. The system is composed of an adaptive response system with a speech recognition engine and of an operator support structure, which is dynamically involved in the answer when a sentence is not recognized or the question belongs to those classified as complex. In this case the call is routed to the operator who best fits the caller's features.
1 Introduction
Over the last few years the world of Call Center Systems has been undergoing a deep transformation. Starting from the evidence of the cost differential between live contact handling and automated transactions¹, many companies (but also non-profit organizations) are switching from fully operator-based solutions to self-service ones. IVR (Interactive Voice Response) and web contact are the main self-service tools.
Besides reducing labor and training costs, self-service also allows a uniform style of presentation and extends the time of access to the whole day. However, self-service cannot entirely replace the role of human agents: it can automate some operations and lighten human agents' work, but it hardly succeeds in managing complex inquiries and special cases, in facing unexpected situations, or in overcoming the resistance of users toward non-usual, non-human interactions. Given that, a solution which (i) deploys IVRs, particularly with automatic speech recognition (ASR), and (ii) integrates IVR events within an operator support structure, seems very interesting: it allows companies to combine the advantages of flexible self-service with those of employing human agents, supporting the user incrementally. Indeed, the initial interaction and the management of mechanical operations are automated, and operators are dynamically drawn in to accomplish complex operations or in case of misunderstandings or problems with the ASR.
¹ Live contact handling runs from $3 to $10 per contact and up, while IVR (Interactive Voice Response) transactions generally cost tens of cents and Web contacts may be mere pennies [1].
However, the solution just mentioned is still not satisfactory: speech-enabled IVRs offer standardized conversations, they are not as flexible as human operators, and they cannot cope with the differences between users and their different needs (e.g., disabled people, novice users, etc.); moreover, unless it is well integrated, the step of switching the call to an operator can be experienced as a system failure and can decrease users' trust, especially if the final answering process is not successful. The project described in this paper aims at finding a solution to these problems, inside a speech-recognition-enabled IVR structure, collaboratively supported by an operator-based structure.
The idea is to use adaptivity to manage the workflow of the answering process. In particular, to achieve integration, adaptivity has to be applied to both main components, that is:
– personalizing the interaction with the automatic voice response system;
– routing the call toward the operator who best fits the caller's needs, when the first solution is not applicable or was not successful.
The advantages of this approach can be summarized as follows: (i) efficiency: using a language fitting the user's features (IVR) and/or routing the call toward an operator fitting the user's features (Routing) decreases the chance of misunderstanding (see [3] for a comparison with a different approach) and, consequently, increases the probability that the user's problems are solved and shortens calls; (ii) a positive user experience: a positive and friendly interaction, which satisfies the user's requests and improves over time, makes the interaction enjoyable; (iii) decreased operator frustration: on the one hand, the operator can leave boring and repetitive operations to the IVR and, on the other hand, adaptive routing lets her/him manage situations that are adequate for her/his competences and skills.
2 Architecture of the system and flow of an inbound call
The main components of a typical Call Center infrastructure running on standard PC servers are a Gateway, which interconnects the Call Center applications with the Public Switched Telephone Network (PSTN), and a Communication Server, which provides the basic functionality for IP telephony. They are not included in figure 1, in order to focus the schema on the logical architecture that carries out the personalized interaction with the user (caller). As can be seen in the figure, the core of our application is the Response Manager, toward which the Communication Server routes the traffic.
It controls the flow of the call and the dialog among the modules which manage: i) the Voice User Interface (VUI), composed of the Automatic Speech Recognition engine, in charge of understanding natural language user input, and the Response Generation Agent, in charge of prompting, providing menus and answering the user; ii) the routing of the call, accomplished by the Routing Agent. Whenever a call is received, the system checks the calling number. If it is not recognized, the user is treated as a new one; otherwise, there are two options: the caller is a customer or (s)he is using the phone of another customer.
Fig. 1 – Architecture for managing the personalized interaction with the caller (the figure shows the caller, the VUI with its Speech Recognition Engine, grammars and Response Generation Agent, the Response Manager with knowledge base KB1, the Routing Agent with the user and operator models and knowledge base KB2, and the operational systems supplying user data).
To manage this second possibility, and to avoid wrong forms of personalization, the system asks whether the calling number is the usual one. Consequently, there are two types of dialog interaction: standard, for non-customer callers; personalized, for customer callers, based on her/his model. During the interaction, the user makes her/his request, which is understood by the Speech Recognition Engine (using a keyword spotting technique) and analyzed by the Response Manager. If the request is identified as a simple one, the IVR is charged with supplying the requested service and the Response Generation Agent produces the answer according to the features of the caller; the TextToSpeech engine, inside it, generates the final output. If the request is complex, or the ASR engine does not recognize the user input, the Response Manager switches the call to the Routing Agent.
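The dispatch logic just described (new caller vs. customer, simple vs. complex request, ASR failure) can be sketched as follows. All class and field names are illustrative assumptions, and a trivial stand-in replaces the real ASR engine and keyword spotting:

```python
from dataclasses import dataclass

@dataclass
class Call:
    calling_number: str
    utterance: str   # what the (stand-in) ASR understood; "" = not recognized

class ResponseManager:
    def __init__(self, customers: dict, complex_keywords: set):
        self.customers = customers          # calling number -> user model
        self.complex_keywords = complex_keywords

    def handle(self, call: Call) -> str:
        user = self.customers.get(call.calling_number)  # None -> unknown number
        recognized = call.utterance.strip() != ""
        is_complex = any(k in call.utterance.lower()
                         for k in self.complex_keywords)
        if user is None:
            return "standard-dialog"        # non-customer caller
        if not recognized or is_complex:
            return "route-to-operator"      # hand over to the Routing Agent
        return "personalized-ivr-answer"    # Response Generation Agent answers

rm = ResponseManager({"011555": {"age": 40}}, {"mortgage", "complaint"})
print(rm.handle(Call("011555", "what is my balance")))   # personalized-ivr-answer
print(rm.handle(Call("011555", "I have a complaint")))   # route-to-operator
```

The real system additionally confirms whether the calling number is the caller's usual one before applying personalization; that check is omitted here for brevity.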
As described in the following, this agent routes the call toward an operator, taking into account both the caller's and the operators' models.
3 Personalized automatic voice response
In this section, and in the next one, we focus on the components which couple voice response systems and routing systems with adaptation rules, in order to carry out the adaptive management of the call. Our prototype concerns the Call Center of a bank, so the adaptation rules, the features of the user models and also the grammars for automatic speech recognition are specific to this application domain. As seen, the VUI of the system includes two complementary technologies: speech input and speech output. Similarly to a GUI, personalizing a VUI can concern the adaptation of the content, of the form of presentation and of the navigation structure, which, for an IVR, means the menu commands. Given the goals discussed in the introduction, we limited the adaptation to the first two aspects. To accomplish this, we integrated the automatic response system with an agent² that decides which is the right answer for the specific caller, on the basis of a set of rules (see Knowledge Base KB1 in the figure). The system then uses TextToSpeech for the dynamic generation of words that are specific and variable (e.g., user name, user features, data from the DB, etc.), adding them to previously recorded message chunks (e.g., welcome formulas, the questions to ask, pieces of phrases for asking details about the user's question, etc.). The reason is that people do not like TextToSpeech because of its unnatural prosody and poor intelligibility (see the experience of British Telecom [5]).
USER MODELING. As can be seen in figure 1, the Response Generation Agent accomplishes its task by querying the model of the caller. The system stores a user model for each caller, building and updating it on the basis of the bank's Customer DB. The model is structured as a set of dimensions.
Both the Response Generation Agent and the Routing Agent access the same user model, but use different dimensions. In particular, those used by the Response Generation Agent are the following: age; skill level (ability in the use of the application, which basically depends on the number of calls); knowledge of the domain (deduced with secondary inferences from the user's school level, job and kind of question); satisfaction (inferred from the lack of complaints and problems during previous interactions); cost (related to the time subtracted from other calls, and also a monetary cost if the number is toll-free for the caller; it is estimated on the basis of the number of calls to the Call Center in a period of time). A customer DB stores the basic data, typically coming from the operational systems of the bank, and data regarding calls. The user model is computed from these data according to the criteria sketched above. The dimensions depending on calls are updated at each interaction, while the others are updated periodically.
PERSONALIZATION. According to usability research [2], and to the preliminary tests we performed on the use of IVRs vs. traditional call centers, we concluded that the typical ways human operators adapt their dialog to the caller are varying the level of detail of explanations, adding some hints, and changing the manner and style of utterance. We used the following techniques [4] to reproduce this behavior:
• Variation of the amount of information (text addition or removal): if the caller is a novice in the use of the application, we add some hints to the system prompts (in the form of example answers), hypothesizing that (s)he probably needs help.
• Variation of the type of information provided (changing a part of the text according to the user's features): a) prompt variants: we produce different versions of the same prompt on the basis of the user's cost (if high, the system's answers are shorter, to save time), of the user's satisfaction (if low, the system's answers are longer, as they contain incentives) and of the user's experience (if the user is very expert, we replace generic prompts with her/his last request, to anticipate her/his needs and avoid annoying her/him); b) adaptive natural-language generation: we formulate phrases with a degree of complexity that varies with the user's level of domain knowledge;
• Style variation: generation of natural-language sentences characterized by different levels of formality according to the user's age.
² In the figure it corresponds to some modules inside the Response Generation Agent.
4 Adaptive Routing
The switch of the call from the IVR to the operator (when necessary) has to be experienced as a soft and natural transition, namely as an additional service and not as a deficiency. For this reason it is important to route the call toward an operator who fits the user's features and keeps up a register of interaction homogeneous with the previous one. Common call routing systems route calls using techniques like FIFO (arrival order) or some kind of priority over groups of operators; instead, the requirement for an Adaptive Routing System is to select the operator who is most adequate for answering a specific call from a specific customer. To implement this, we integrated the routing system with an agent which, starting from the caller's features, and taking into account the models of all the operators, selects the operator who best fits and is idle at that time. The criteria for the match are defined in a set of business rules, stored in a KB (see Knowledge Base KB2 in the figure).
USER MODELING.
As explained, in this second step of the answering process, the models taken into account and matched are those of the caller and of the operators. As regards the former, we have already briefly described its structure; here we just mention the specific features taken into account by the Adaptive Routing Agent: value (the customer's profitability for the bank, estimated from data such as account balance, volume of operations, etc.), risk of churn (the probability that the customer closes her/his accounts), region of birth, knowledge, satisfaction, age, gender. The first two dimensions of the model are computed according to formulas provided by experts (the chief marketing officer, in our prototype). As regards the operators, the system stores a model for each one in the Call Center and updates their data periodically through evaluation tests. The main dimensions of the operator models are: skill, knowledge, area of birth, communicative ability, expertise, age, gender. Clearly, these features rarely change.
PERSONALIZATION. A set of business rules specifies the caller/operator match. Some of them are simple correspondences between users' features (e.g., age, region of birth) and aggregations of ranges of the operators' features. For example, "same area of birth" is due to the fact that people are pleased when the unknown voice that answers is somehow familiar to them (immediately perceivable elements are dialectal inflection, age and schooling). Other rules are combined evaluations of utility, which consider, for each caller, her/his economic value, the performance required in the response (inferred from the frequency of operations and her/his expertise) and the risk of churn. For the operator, the dimensions taken into account are the skill level, the ability to communicate and the rate and speed of successfully completed answers.
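A minimal sketch of such a caller/operator match is given below. The weights and feature names are invented for illustration; in the real system these criteria are encoded as business rules in KB2 and evaluated by an inference engine:

```python
def score(caller: dict, operator: dict) -> float:
    """Toy scoring of one operator for one caller; weights are assumptions."""
    s = 0.0
    if operator["area_of_birth"] == caller.get("region_of_birth"):
        s += 1.0   # a familiar-sounding voice pleases the caller
    if operator["knowledge"] > caller.get("knowledge", 0):
        s += 2.0   # operator should know the domain better than the caller
    # High churn risk or high value favor skilled communicators:
    s += caller.get("risk_of_churn", 0) * operator["communicative_ability"]
    s += caller.get("value", 0) * operator["skill"]
    return s

def route(caller: dict, operators: list):
    """Pick the idle operator with the highest score (None if all busy)."""
    idle = [o for o in operators if o["idle"]]
    return max(idle, key=lambda o: score(caller, o)) if idle else None
```

For simplicity this sketch only considers idle operators; the system described below instead ranks all operators by score and walks down the ranking until an idle one is found.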
In any case, the selected operator must have domain knowledge higher than the caller's, and expertise and communicative ability adequate to manage the risk of churn. On the basis of these rules, the inference engine dynamically assigns a score to each operator's characteristics. Consequently, the call is routed to the operator with the highest score if (s)he is idle, otherwise to the one with the second-highest score, and so on.
5 Conclusions
The paper presents a prototype for the adaptive management of calls toward a Call Center system, integrating technologies from different fields. Important applications can be built upon this infrastructure: for example, applications for disabled people and for CRM purposes. A natural evolution of this system is to apply adaptivity also to the operator's screen, to supply adaptive support for managing calls.
Implementation remarks. The goal we pursued was to develop modules which could be integrated into commercial CTIs in an open architecture. For the prototype, we implemented our agents on a Cisco platform (Customer Response Application v3), which is supported by IP networks and runs in a Java environment. Other components are: ICM (Cisco Intelligent Contact Manager), which routes the call; the Nuance TTS server, which translates text into voice; the Nuance ASR server, which contains the voice recognition engine (based on the GSL language); and the JESS shell, used to implement the routing agent.
References
[1] Bocklund, L., Bengtson, D.: Call Center Technology Demystified, Call Center Press, ICMI Inc., Maryland, 2002.
[2] Halpern, E.: Human Factors and Voice Applications, in VoiceXML Review, Vol. 1, 2001.
[3] Horvitz, E., Paek, T.: Harnessing Models of Users' Goals to Mediate Clarification Dialog in Spoken Language Systems, in LNCS 2109, 2001, pp. 3-13.
[4] Kobsa, A., Koenemann, J., Pohl, W.: Personalized Hypermedia Presentation Techniques for Improving Online Customer Relationships, in The Knowledge Engineering Review, 2001, pp. 111-155.
[5] Stentiford, F.W.M.
and Popay, P.A.: The design and evaluation of dialogues for interactive voice response services, in BT Technology Journal, Vol. 17, No. 1, January 1999; also in Insight Interactive, December 2000.
User Interaction with an Automated Solver: The Case of a Mission Planner
Amedeo Cesta, Gabriella Cortellessa, Angelo Oddi and Nicola Policella
PST@ISTC-CNR, Planning & Scheduling Team*, Institute for Cognitive Science and Technology, Viale K. Marx 15, I-00137 Rome, Italy {cesta|corte|oddi|policella}@ip.rm.cnr.it http://pst.ip.rm.cnr.it
Abstract. This paper describes the interaction module of a system, named Mexar, developed to support human mission planners in the Mars Express program. Mexar addresses the Mars Express Memory Dumping Problem (Mex-Mdp), a problem that requires continuous attention during mission operations. The interactive environment of Mexar helps mission planners analyze the current problem and take scheduling decisions as the result of an interactive process enhanced by several sophisticated facilities. Elaborate interactive techniques have been integrated to address three different aspects: (a) developing the user's trust in the automated algorithms; (b) guaranteeing the user the possibility of expressing her own preferences during the whole problem solving process; (c) promoting deep participation of the user in the problem solving process. Mexar was successfully delivered to the mission planners in May 2002. The design and use of advanced experimental methodologies to measure the quality and performance of the operational tool are part of our current research.
1 Introduction
State-of-the-art interactive technology can be very useful in supporting human tasks, and its applications range from simple systems supporting daily human activities to more complex and sophisticated decision support tools (e.g.
in the context of medical environments, transportation domains, or space missions) that help human decision makers take important and difficult decisions. Unfortunately, automated tools very often present deficiencies and shortcomings, resulting in ineffective use or in users not trusting them. There are at least a couple of reasons for this mistrust of automated systems:
– Most interactive devices are endowed with a badly designed interface that does not present information properly, so as to satisfy the user's needs. A user tends to be skeptical toward the use of a black box that hardly explains its choices, actions and results. An appropriate and effective representation of the model of the problem, of all the objects and entities involved, and of the solutions or advice proposed by the artificial aid should be guaranteed, in order to promote a more fluid and flexible interaction.
– The naturally conservative behavior of people in changing their habits hinders the spread of such supporting technology, in particular in those contexts in which critical decisions are taken. The final users of systems of this kind are used to making decisive choices and performing complex tasks completely "by hand", and attempting to automate decisions or jobs is not a trivial problem. Users tend not to abandon their traditional way of working and acquire new habits unless, we believe, this entails higher quality, higher speed in obtaining outcomes, more stimuli, less annoyance, certainty of the correctness of the results and, above all, the possibility of actively participating while remaining in charge of the final decisions.
For these reasons the design of interactive systems is a hard and challenging problem, and the success of such systems also depends on an effective and useful interaction module.
* This work describes results obtained in the framework of a research study conducted for the European Space Agency (ESA-ESOC) under contract No. 14709/00/D/IM.
The interaction environment should guarantee a friendly and comprehensible representation of the problem, the problem-solving process and the solutions, especially when the automated system is devoted to supporting a user in solving complex problems. It should allow a user to verify the correctness of the results, to express her own preferences, and to change her way of working gradually, through a smooth adaptation to the innovation. In this paper we present our experience in developing a tool that supports the user in the context of a space mission, reporting on the work done to design and realize an intelligent interface able to represent a huge amount of complex information in a compact and meaningful way and to help solve an involved problem. The remainder of the paper is organized as follows: Section 2 briefly introduces the problem addressed; Section 3 describes the interactive techniques conceived and implemented within Mexar through the design of two different environments (the Problem Analyzer and the Solution Explorer, described in Sections 3.1 and 3.2 respectively). Section 3.3 also provides some comments from the final users and highlights the need for a more advanced and accurate experimental evaluation of the tool. Some concluding comments end the paper.

2 A Tool for Solving a Complex Problem

Mars Express is a space probe launched by the European Space Agency (ESA) on June 2, 2003, which will orbit Mars for two years starting from the beginning of 2004. It carries eight different scientific payloads that will gather data on both the surface and the atmosphere of the Red Planet. During the operational phase around Mars, a team of people, the Mission Planners, will be responsible for deciding the on-board operations of Mars Express.
Any single operation of a payload, called a POR (Payload Operation Request), is decided well in advance through a negotiation phase among the different actors involved in the process (e.g., scientists, mission planners, FD experts). The result of this negotiation is (a) the acceptance or rejection of each POR, and (b) a start-time assignment for the accepted PORs. At the operational level the mission planners will be responsible for data return to scientists for each POR. Their goal will be to guarantee an acceptable turnover time from the end of the execution of a POR to the availability on Earth of the data it generated. Data produced by any POR are indeed stored in the on-board memory and need to be transmitted to Earth. Within the Mars Express mission we investigated the problem of downlinking the spacecraft's on-board memory, automating a task that has so far been performed by a human mission planner, and developing an interactive tool that supports the mission planner while leaving the humans in charge of their responsibilities. The result of our study is a system called Mexar that addresses the Mex-Mdp in Mars Express. Briefly, a Mex-Mdp problem consists of synthesizing the commands for the satellite in order to download data from the spacecraft to Earth during the downlink visibility windows (1). The main goal is to flexibly manage the memory banks (2), avoiding losses of data and assuring good availability times of data on Earth. The Mex-Mdp problem is described in detail in [4], where a description of the approach conceived to solve it is also given. In particular, different solving algorithms have been developed that work on a CSP representation of Mex-Mdp: a greedy heuristic, a randomized algorithm and a local search procedure. Mexar integrates those solving techniques to create an interactive support system that mission planners can use to decide policies for downlinking data to Earth.
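To make the structure of a Mex-Mdp-like problem concrete, the following minimal sketch shows a greedy dump scheduler in the spirit of the greedy heuristic mentioned above. It is our own illustration, not the Mexar algorithm: all names (`Por`, `Window`, `greedy_dump_schedule`) and the earliest-ready-first policy are assumptions, and real packet-store constraints are ignored.

```python
# Hedged sketch: PORs produce data volumes in on-board packet stores,
# which must be dumped to Earth within downlink visibility windows.
from dataclasses import dataclass

@dataclass
class Por:
    name: str
    store: str      # packet store holding the POR's data (hypothetical field)
    volume: float   # data volume still to dump
    ready: float    # time at which the data are available on board

@dataclass
class Window:
    start: float
    end: float
    rate: float     # downlink rate during this visibility window

def greedy_dump_schedule(pors, windows):
    """Greedily assign dump operations to visibility windows,
    serving PORs earliest-ready first (mutates the Por volumes)."""
    schedule = []
    pending = sorted(pors, key=lambda p: p.ready)
    for w in sorted(windows, key=lambda w: w.start):
        capacity = (w.end - w.start) * w.rate   # volume dumpable in this window
        for por in list(pending):
            if capacity <= 0:
                break
            if por.ready >= w.end:              # data not yet on board: skip
                continue
            dumped = min(por.volume, capacity)
            schedule.append((por.name, por.store, w.start, dumped))
            capacity -= dumped
            por.volume -= dumped
            if por.volume <= 0:                 # POR fully dumped to Earth
                pending.remove(por)
    return schedule, pending                    # pending PORs risk data loss
```

A POR larger than one window's capacity is split across windows, mirroring the segmentation of a POR into several dump operations described later in the paper.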
The remainder of the paper describes the interactive functionalities that have been designed and implemented within Mexar.

3 Interactive Techniques in MEXAR

In designing the interactive services of Mexar we kept in mind three main issues which, in our opinion, are relevant to the success of the interaction:

– the visualization problem, that is, the need to guarantee a certain level of "transparency" to the user, providing comprehensible and significant representations of the real domain, the problem, its solutions and the problem-solving process. This need is also related to the general skeptical attitude of users toward automated systems mentioned above. In order to gain the user's trust, an automated system should be endowed with clear and expressive representation services. We refer to this as the glass-box principle (see [2]), in contrast with the widespread tendency of users to consider automated systems as black boxes to be distrusted.

– personalizing the interaction, to allow a user to choose her own interaction modality and express her preferences. Users, in fact, differ in their needs, and even the same user may wish to interact in different ways. In providing different and alternative representations within Mexar we took care to design an interaction modality quite close to the traditional way of working of the human planner. In this way a user can count on a system that facilitates her task by providing information and solutions close to the way she is used to, and can gradually become acquainted with the alternative interaction modalities.

(1) Time intervals during which it is possible to downlink data to Earth.
(2) The on-board memory is subdivided into different memory banks called packet stores.
– interactive participation of the user: once a user is acquainted with the system through this "close and familiar" interaction modality, she might be more confident in using different and more powerful facilities that immerse her more deeply in interactive problem solving. This means that she can either simply accept the solution proposed by the automated solver, or initiate a more complex interactive process during which she contributes to the construction of a solution.

Mexar is endowed with a sophisticated interaction module that allows a user to easily supervise and control the entities of the domain and the whole solving process, being aware of all the steps the solver goes through. Specialized interactive functionalities guarantee the user a certain level of information about, and control over, what is going on. These interactive techniques are grouped together in the Problem Analyzer (PA) module. The PA gives the user the possibility to inspect the problem and obtain an initial solution by choosing among different solving algorithms she can personalize. A "friendly" graphic interface abstracts away the complexity of the problem representation and the technicality of the solving algorithms by providing a high-level description of the problem and of the algorithms' features. Through the graphic representation a user can easily choose and configure the solving algorithm, tuning the required parameters according to her preferences. Moreover, an advanced graphic environment named the Solution Explorer (SE) allows an expert user to select the best solution for execution as the result of a "step by step" procedure supported by incremental improvement algorithms, evaluation services, and graphic comparison functionalities. In the following we describe the way in which those ideas have been implemented within the system and the features of the two main modules the Mexar interaction environment is composed of.
3.1 The Problem Analyzer

Figure 1(a) shows the basic layout of the Problem Analyzer. It provides different representations of the problem features and allows a user to choose which one to focus on. A "traditional representation" of the problem provides a detailed description very close to the one the human mission planner is used to. It shows the list of PORs in textual form, specifying the related information in detail. An "alternative" graphic representation provides an additional description of the input activities (PORs) and their distribution on the payload timelines. For each payload a different timeline is shown, where each POR is represented by a colored rectangle labeled with a natural number and positioned at its start time; the used capacity of the packet stores is also provided, as a graphic view of the temporal function representing the volume of stored data against the packet-store capacity. The alternative graphic view offers a more compact, intuitive and high-level vision of the problem and the solution. A user can either focus on one modality or use both. The two representations are indeed linked and synchronized with each other through a set of interactive links. Once the CSP representation of the problem (described in [3]) is instantiated and used to show information on the interaction panel, a user can choose among different strategies to solve it, personalizing her choices by tuning the parameters of the algorithm. After the solver has run, the situation is shown to the user as in Figure 1(b).

Fig. 1. An example of alternative interaction modalities in Mexar: (a) examining problem features; (b) studying a solution.

Alternative representations of the solution can be examined: (a) a solution table (the traditional representation in the figure) is a data structure that reconstructs all the details concerning the solution of the current problem.
Using the table it is possible to check how data from a single POR are segmented into different dump operations, how the data return time has been generated, and so on. In general it would also be possible to generate the dump commands directly from the lines of the table and to validate the results produced by the problem solver with different algorithms; in fact the whole table can be saved as a separate file and manipulated by different programs. (b) An alternative graphic view of the solution provides a more compact and qualitative vision of the same information. It is also possible to go further in the integration of this window with the visual features of the Problem Analyzer layout. Another feature is, indeed, the possibility to evaluate a solution according to certain metrics. This allows the user to easily estimate the quality of a solution with respect to some chosen parameters [3].

3.2 The Solution Explorer

Once the human planner has a deeper knowledge of the problem and of all the aspects involved, she can start a deeper level of interaction with the system, trying to contribute her expertise and judgment to the problem solving.

Fig. 2. Involving the user in the problem-solving process: (a) an intermediate step in exploring the solution space; (b) a more advanced state in the exploration.

In this way she can choose either to entrust the system completely with the task of finding a solution, or to participate more interactively in the problem-solving process. As we said before, the Problem Analyzer allows a user to apply different solving methods to the same problem. In addition, specific functionalities allow the user to save different solutions for the same problem and to guide a search for improvements of the current best result by applying different optimization algorithms. The idea behind the Solution Explorer is that an expert user can participate more deeply in the problem-solving process.
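This explore-and-improve cycle can be summarized in a short sketch. The three stand-in functions below (`initial_solution`, `local_search`, `cost`) are hypothetical placeholders for Mexar's actual solvers and metrics; only the loop structure, with every intermediate solution saved for the user's later review, reflects the idea described in the text.

```python
import random

def initial_solution(seed):
    """Stand-in for a randomized constructive solver (hypothetical)."""
    rng = random.Random(seed)
    values = list(range(8))
    rng.shuffle(values)
    return values

def local_search(solution, steps):
    """Stand-in improvement step: sort a prefix; `steps` plays the
    role of a user-tunable parameter of the optimization algorithm."""
    return sorted(solution[:steps]) + solution[steps:]

def cost(solution):
    """Stand-in evaluation metric: out-of-order adjacent pairs."""
    return sum(1 for a, b in zip(solution, solution[1:]) if a > b)

def explore(n_restarts=3, tunings=(4, 8)):
    """Human-guided exploration: every intermediate solution is kept,
    and at the end the best candidate is selected for execution."""
    saved = []
    for seed in range(n_restarts):          # different starting points...
        s = initial_solution(seed)
        saved.append(s)
        for steps in tunings:               # ...and different tunings
            s = local_search(s, steps)
            saved.append(s)                 # keep every path for review
    return min(saved, key=cost)
```

In Mexar the final `min` is of course replaced by the user's own judgment over the saved solution series.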
A user might generate an initial solution, save it, try to improve it by local search, save the results, try to improve it again by local search with different tuning parameters, and so on. This procedure can be repeated for different starting points, resulting in the generation of different paths in the search space. Using both the evaluation capability on a single solution and her own experience, the user can visit the different solution series, all of which are saved, and, at the end, choose the best candidate for execution. Figure 2 shows two examples of the Solution Explorer referring to a single problem at different stages of exploration. Studying the examples, it is possible to see that our idea has again been to facilitate the analysis of the current problem by providing multiple representations of the problem features. A user has different tools to evaluate the solutions and can either generate new ones or choose the best according to different temporary criteria.

3.3 MEXAR Current Status

Mexar was delivered to ESA-ESOC in May 2002 and is currently available to the mission planners. Users' reactions to Mexar have been quite positive. We highlight in particular a real interest in the idea of using an automated tool that performs boring and repetitive tasks on their behalf, while preserving their control over the flow of actions and the possibility to choose the final solution, supported by the potential of the automated tool. Interactive systems represent a very interesting attempt to take advantage of the complementary reasoning styles and computational strengths of both human and artificial solvers. In particular, the ideas behind the Solution Explorer provide a concept of human-guided search (see [1] for a different approach), which can be very useful in the search for solutions to complex problems.
One of our future aims is to subject our system to an extensive and accurate experimental evaluation, in order to highlight possible deficiencies and problems and make Mexar an effective operational interactive tool.

4 Conclusions

In this paper we described our experience in designing and developing an interaction environment for Mexar, an interactive tool devoted to solving a complex problem in the context of the Mars Express mission. Providing a useful and effective interface that allows interactive problem solving shared between the user and the automated system is a problem as challenging and arduous as developing efficient automated algorithms. Our system integrates automated techniques with sophisticated interactive functionalities in order to address the visualization problem, to provide the user with the possibility to personalize the interaction, and to leave her the responsibility of deciding the final solution. Users' reactions to the Mexar tool have been quite encouraging, although the design and application of a precise evaluation methodology is a need we are currently working on.

References

1. Anderson, D., Anderson, E., Lesh, N., Marks, J., Mirtich, B., Ratajczack, D., and Ryall, K. Human-Guided Simple Search. In Proceedings of the National Conference on Artificial Intelligence (AAAI 2000) (2000).
2. Cesta, A., Cortellessa, G., Oddi, A., and Policella, N. Interaction Services for Mission Planning in Mars Express. In Proceedings of the 3rd International NASA Workshop on Planning and Scheduling for Space, Houston, Texas, October 27-29 (2002).
3. Cesta, A., Oddi, A., Cortellessa, G., and Policella, N. Automating the Generation of Spacecraft Downlink Operations in Mars Express: Analysis, Algorithms and an Interactive Solution Aid. Tech. Rep. MEXAR-TR-02-10 (Project Final Report), ISTC-CNR [PST], Italian National Research Council, July 2002.
4. Oddi, A., Policella, N., Cesta, A., and Cortellessa, G.
Generating High Quality Schedules for a Spacecraft Memory Downlink Problem. In Principles and Practice of Constraint Programming, 9th International Conference, CP 2003 (2003), Lecture Notes in Computer Science, Springer.

Usability Evaluation of Tools for Nomadic Application Development

Cristina Chesta (1), Carmen Santoro (2), Fabio Paternò (2)
(1) Motorola Electronics S.p.a. – GSG Italy, Via Cardinal Massaia 83, 10147 Torino, Italy, [email protected]
(2) I.S.T.I. - C.N.R., Via G. Moruzzi 1, 56100 Pisa, Italy, {c.santoro, f.paterno}@cnuce.cnr.it

Abstract. Evaluating the usability of tools for the development of nomadic applications raises specific issues, since it requires attention both to the interface of the tool itself and to the interface of the applications produced through the tool. The paper addresses this problem by discussing the criteria and methodologies applied, as well as the results obtained, in an experimental activity on the subject.

Introduction

Several research activities have addressed design principles, methods and software development tools to increase system usability while reducing development costs. Meanwhile, a number of criteria and methodologies to evaluate usability have been proposed. However, there is a lack of studies aimed at evaluating the usability of the development tools themselves, in particular when nomadic applications, which can be accessed through different types of interaction platforms, are considered. This requires a double point of view: on the tool itself, and on the product realized through the tool. The problem is acquiring increasing importance because many organizations have to develop new applications able to exploit the possibilities provided by mobile devices.
This paper addresses these issues by presenting some results of the evaluation activities carried out within the framework of the CAMELEON project at Motorola GSG Italy, with the objective of assessing the usability of the TERESA tool for the design and development of multi-platform applications proposed by the HCI Group of I.S.T.I.-C.N.R. The paper first introduces and describes the TERESA tool, then presents the evaluation criteria adopted and the organization of the experiments, and finally reports some significant results followed by concluding remarks.

The TERESA tool

TERESA is a transformation-based environment designed and developed by the HCI Group of I.S.T.I.-C.N.R. It is intended to provide a complete semi-automatic environment supporting a number of transformations that are useful for designers to build and analyse their design at different abstraction levels, and consequently to generate the concrete user interface for a specific type of platform. The abstraction levels considered are: the task model level, where the logical activities to support are identified; the abstract user interface, where the interaction objects are considered, classified in terms of their semantics and still independent of the actual implementation; and the concrete user interface (the actual corresponding code). The main transformations supported in TERESA are:

• Presentation sets and transitions generation. From the specification of a ConcurTaskTrees [4] task model it is possible to obtain the sets of tasks that are enabled over the same period of time according to the constraints indicated in the model. Such sets, depending on the designer's application of a number of heuristics supported by the tool, may be grouped together into a number of Presentation Task Sets (PTSs), together with the Transitions among the various PTSs.

• From task-model-related information to the abstract user interface.
Both the task model specification and the PTSs are the input of the transformation generating the associated abstract user interface, which is described in terms of both its static structure (the "presentation" part, i.e. the set of interaction techniques perceivable by the user at a given time) and its dynamic behaviour (the "dialogue" part, which indicates what interactions trigger a change of presentation and what the next presentation is). The structure of the presentation is defined in terms of elementary interactors, characterised by the task they support, and their composition operators. Such operators are classified in terms of the communication goals to achieve: a) Grouping indicates a set of interface elements logically connected to each other; b) Relation highlights a one-to-many relation among some elements: one element has some effect on a set of elements; c) Ordering highlights some kind of ordering among a set of elements; d) Hierarchy defines different levels of importance among a set of elements.

• Automatic abstract UI generation. Through this option the tool automatically generates the abstract UI for the target platform (instead of going through the two transformations above), starting with the currently loaded (single-platform) task model and using a number of default configuration settings related to user interface generation.

• From the abstract user interface to the concrete user interface for the specific platform. This transformation starts with the abstract user interface and yields the related concrete user interface for the specific interaction platform selected. A number of parameters related to the customisation of the concrete user interface are made available to the designer. The tool generates XHTML and XHTML Mobile Profile according to the type of platform selected.
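The abstract-UI structure just described, elementary interactors combined through the Grouping, Relation, Ordering and Hierarchy operators, can be modelled as a small tree. The sketch below is our own illustrative encoding, not TERESA's internal representation; the interactor kinds and the example tasks are hypothetical.

```python
# Illustrative encoding of a TERESA-style abstract user interface:
# leaf interactors characterised by the task they support, composed
# through the four operators named in the text.
from dataclasses import dataclass, field
from typing import List, Union

@dataclass
class Interactor:
    task: str    # the task the interactor supports
    kind: str    # semantic classification, e.g. "text_edit", "activator"

@dataclass
class Composition:
    operator: str    # "grouping" | "relation" | "ordering" | "hierarchy"
    children: List[Union[Interactor, "Composition"]] = field(default_factory=list)

def count_interactors(node):
    """Count the elementary interactors in a presentation tree."""
    if isinstance(node, Interactor):
        return 1
    return sum(count_interactors(child) for child in node.children)

# A hypothetical login presentation: two grouped text fields, with the
# submit control related one-to-many to both of them.
login = Composition("relation", [
    Composition("grouping", [
        Interactor("insert username", "text_edit"),
        Interactor("insert password", "text_edit"),
    ]),
    Interactor("confirm login", "activator"),
])
```

A platform-specific generator would then walk such a tree, mapping each interactor and operator to concrete widgets and layout for the target platform.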
Evaluation Criteria and Methodology

Starting with the ISO 9241-11 standard definition [1] and Shneiderman's [3] and Nielsen's [2] metrics, but considering the double perspective of the tool itself versus the product realized through the tool, we identified four aspects to be evaluated and eight related requirements, as listed in Table 1.

Table 1. Evaluation criteria

Aspect                                 Requirement
Tool Interface                         Intuitiveness; Learnability
Tool Functionalities                   Completeness; Developer Satisfaction
Final Product Obtained with the Tool   User Satisfaction; Maintainability and Portability
Approach Cost/Effectiveness            Development Efficiency; Integrability

The experimental evaluation was conducted in parallel with the tool development, in order to provide a formative rather than a summative evaluation. Two experiments were designed to cover different aspects according to the criteria framework presented above. Both of them refer to a common application scenario related to a Business-to-Employee environment. Five subjects, selected from the Motorola GSG Italy staff, were involved in the evaluation. Although with a range of different backgrounds and specializations, all of them have technical knowledge and experience in software design and development, and are experienced computer users. They were asked to participate in a 30-minute preparation session and to dedicate 10 minutes to reading the TERESA help prior to starting the exercises. The first experiment focused on tool usability and functional coverage, with the objective of highlighting potential weaknesses and providing design recommendations useful for implementing subsequent versions of the TERESA tool. The experiment consisted of starting with a given task model created with CTTE 1.5.7 and obtaining the concrete user interface for both a desktop and a mobile phone using version 1.1 of the TERESA tool.
The goal of the exercise was to realize a simple version of an e-desk application allowing three main actions: registration to the service by inserting a username and a password, selection of a location (workplace, home, travel or vacation), and selection of an application from a menu. The applications offered differ between the desktop and mobile versions of the service. The actions to be performed, such as Generate Enabled Task Sets, Generate Abstract User Interface, etc., were predefined so as to require access to every tool menu. For each step the evaluators were asked to record any difficulties they encountered in achieving the goal and their suggestions for improving the user interface. In addition, they were invited to provide comments about the approach, the functionalities and the result produced, reporting advantages and disadvantages with respect to traditional methods and indicating additional functionalities they would like to see introduced. The first evaluation provided a number of suggestions that were considered while developing the new versions of TERESA, such as the possibility of links between abstract interaction objects and the corresponding tasks in the task model, so that designers can immediately identify their relations. A second experiment was then conducted in order to collect more information about developer satisfaction and the cost/effectiveness of the approach. The experiment consisted of developing a prototype version of an e-Agenda application running on both a desktop and a mobile phone and including the following functionalities: visualization of the appointments of a single day; visualization of the details of each appointment; the possibility of inserting/modifying/deleting an appointment. This had to be realized in two ways:

1. At first using traditional techniques, such as a template for the design phase and Microsoft FrontPage or Netscape Composer for the implementation phase.
2.
Then using tool-supported techniques: CTTE 1.5.7 for the task tree realization and version 1.5 of the TERESA tool (updated taking into account the results of the first experiment) for the generation of the XHTML and XHTML Mobile Profile pages.

The evaluators were required to collect quantitative metrics related to development efficiency, such as the total effort needed to complete the exercise, expressed as creation or rework time and categorized by process phase, as well as the number of errors introduced. Moreover, they were asked to express their judgment on specific TERESA characteristics, such as the support offered in identifying the most suitable interaction techniques, the support offered in composing interactors in the interface, and other aspects related to developer satisfaction and product maintainability/portability, with a rating from 1 (poor) to 5 (very good). In case of a negative evaluation they were invited to provide an explanatory note and suggestions for improvement.

Evaluation Results

The evaluation produced a large amount of data about the aspects considered. For the first experiment, the analysis was conducted in two steps. First, the raw comments were abstracted into recurrent issues aggregated according to a functional criterion, counting the occurrences of each issue; this step was conducted iteratively, in order to progressively obtain a clean taxonomy. In the second step the resulting taxonomy was presented to the evaluators, who were asked to express a relevance assessment for each issue, either high, medium, or low; from these new data a relevance index was synthesized for each item. The results of the analysis were reported to the development group, which integrated them into the new version of TERESA used for the second experiment. The new version 1.5 of TERESA is substantially improved with respect to the first prototype.
For example, the effect of the heuristics used for combining two or more PTSs has been made more predictable, the AUI generation window has been redesigned to be more intuitive and usable, the Final User Interface generation has been improved by the introduction of a preview window, the task corresponding to an object can be automatically identified, and some model editing options have been introduced. The results of the second experiment show how developers' productivity is affected by the use of the tool. Data about time performance were collected in each phase of the experiment and summarized through average values. The results, graphically illustrated in Fig. 1, show similar total times for the traditional and TERESA approaches, with opposite impacts on the different phases. The use of the tool almost doubled the time required at the design stage, while at the development stage it dramatically improved prototyping performance, halving the required time. This leaves a margin for further improvement, since the design time required by the TERESA approach is expected to decrease as the subjects become more familiar with model-based techniques and notations. Moreover, the slight increase in total time is acceptable since it involves a trade-off with overall design quality: many subjects appreciated the benefits of a formal process and the support in identifying the most suitable interaction techniques. For example, designers reported satisfaction with how the tool supported the realization of a coherent page layout and the identification of links between pages. The evaluators noticed and appreciated the improved structure of the presentations and the more consistent look of the pages resulting from the model-based approach, as well as the reduced risk of forgetting the formal specifications. This is also coupled with an increased consistency between the desktop and mobile versions, pointed out by almost all the evaluators.
Fig. 1. Comparative results on time performance (first-version time, rework time and total time for the traditional and TERESA approaches).

The composition of the time spent is also meaningful. The tool-supported methodology offers very good support for fast prototyping, producing a first version of the interface in a significantly shorter time. On the other hand, rework time increased. According to the evaluators' comments, the root causes are the high level of automation of the tool and the current unavailability of control options, which require the developer to subsequently modify the generated interface in order to reach the desired results. Moreover, if changes and corrections are needed at the task model level, the current prototype version of TERESA forces the user into a time-consuming iteration of the whole transformation process. The cost of corrections could be limited by performing the platform filtering at a later stage of the process. Another reason to take into account is the subjects' greater familiarity with traditional techniques than with model-based techniques and notations. Future refinements of TERESA and the introduction of the tool into the software production process are therefore expected to considerably reduce the rework time needed and to confirm the advantages of the proposed tool-supported methodology.

Conclusions and Acknowledgements

In summary, TERESA emerged from the evaluation as an appealing and promising solution for designing and developing UIs for multiple and heterogeneous devices. Future refinements of the tool are expected to considerably reduce the effort needed and to result in further improvements. At the same time, the evaluation methodology and criteria we introduced appear to be general and applicable to different systems. Further activities will include additional experiments focusing on the final product and involving end users.
We gratefully acknowledge support from the European Commission through the CAMELEON IST project. We would also like to thank our colleagues Cristina Barbero, Simone Martini, Bianca Russillo and Massimiliano Fliri for participating in the experimental evaluation and for the useful discussions.

References

[1] ISO 9241-11: Ergonomic requirements for office work with VDTs – Guidance on usability. Technical report, International Standard Organisation, 1991.
[2] Nielsen, J. Usability Engineering. Morgan Kaufmann, San Francisco, 1994.
[3] Shneiderman, B. Designing the User Interface: Strategies for Effective Human-Computer Interaction. Addison-Wesley, Reading (MA), third edition, 1998.
[4] Paternò, F. Model-Based Design and Evaluation of Interactive Applications. Springer Verlag, ISBN 1-85233-155-0, 1999.

Navigating 3D Worlds by Following Virtual Guides: Helping both Users and Developers of 3D Web Sites

Luca Chittaro, Roberto Ranon and Lucio Ieronutti

References

[1] Brusilovsky, P. 2001. Adaptive hypermedia. User Modeling and User-Adapted Interaction, 11, 1-2, 87-110.
[2] Chittaro, L., and Ranon, R. 2002. Dynamic Generation of Personalized VRML Content: a General Approach and its Application to 3D E-Commerce. In Proceedings of Web3D 2002: 7th International Conference on 3D Web Technology, ACM Press, New York, 145-154.
[3] Darken, R.P., and Sibert, J.L. 1996. Wayfinding Strategies and Behaviors in Large Virtual Worlds. In Proceedings of CHI '96, ACM Press, New York, 142-149.
[4] Latombe, J.C. 1991. Robot Motion Planning. Kluwer Academic Publishers, Boston, MA.
[5] Lester, J., Converse, S.A., Stone, B.A., and Kahler, S.E. 1997. Animated Pedagogical Agents and Problem-Solving Effectiveness: A Large-Scale Empirical Evaluation. In Proceedings of the Eighth World Conference on Artificial Intelligence in Education, 23-30.
[6] Péruch, P., Vercher, J.-L., and Gauthier, G.-M. 1995. Acquisition of spatial knowledge through visual exploration of simulated environments. Ecological Psychology, 7, 1-20.
[8] Rickel, J., and Lewis Johnson, W. 2000. Task-Oriented Collaboration with Embodied Agents in Virtual Worlds. In J. Cassell, J. Sullivan, and S. Prevost (Eds.), Embodied Conversational Agents. MIT Press, Boston.
[9] Thorndyke, P.W., and Hayes-Roth, B. 1982. Differences in Spatial Knowledge Acquired from Maps and Navigation. Cognitive Psychology, 14, 560-589.
[10] van Mulken, S., André, E., and Müller, J. 1998. The Persona Effect: How Substantial is it? In Proceedings of HCI '98, Springer Verlag, Berlin, 53-66.
[11] Vinson, N.G. 1999. Design Guidelines for Landmarks to Support Navigation in Virtual Environments. In Proceedings of CHI '99, ACM Press, New York, 278-284.
[12] Walczak, K. 2002. Building Database Applications of Virtual Reality with X-VRML. In Proceedings of Web3D 2002: 7th International Conference on 3D Web Technology, ACM Press, New York, 111-120. 47

Software Environments that Support End-User Development

M.F. Costabile1, D. Fogli2, G. Fresta3, R. Lanzilotti1, P. Mussio2, A. Piccinno1
1 Dipartimento di Informatica, Università di Bari, Bari, Italy {costabile, lanzilotti, piccinno}@di.uniba.it
2 Dipartimento di Elettronica per l’Automazione, Università di Brescia, Brescia, Italy {fogli, mussio}@ing.unibs.it
3 ISTI "A. Faedo", CNR, Pisa, Italy [email protected]

Abstract. An important challenge for the coming years is to develop environments that allow people without a particular background in programming to develop and tailor their own applications. The aim is to empower people to flexibly employ advanced information and communication technologies. In this direction, the European Community has recently funded EUD-Net, a Network of Excellence on End-User Development (EUD).
In this paper, we discuss a framework for EUD and present our current work on designing environments that support the activities of a special category of end-users, called domain-expert users, with the objective of easing the way these users work with computers.

1 Introduction

New technologies have created the potential to overcome the traditional separation between end-users and software developers. New environments can be designed that seamlessly move from using software to programming (or tailoring) it. Advanced techniques for developing applications can be used by individuals as well as by groups, social communities or organizations. Some studies estimate that by 2005 there will be 55 million end-users in the USA, compared to 2.75 million professional developers [1]. The end-user population is not uniform: it includes people with different cultural, educational, training and employment backgrounds, novice and experienced computer users, the very young and the elderly, and people with different types of disabilities. Moreover, these users operate in various interaction contexts and scenarios of use; they want to exploit computer systems to improve their work, but often complain about the difficulty of using such systems. Based on the activity performed so far within the EUD-Net Network of Excellence, the following definition of EUD has been proposed: “End-User Development is a set of activities or techniques that allow people, who are non-professional developers, at some point to create or modify a software artifact”. EUD means the active participation of end-users in the software development process. In this perspective, tasks that are traditionally performed by professional software developers are transferred to the users, who need to be supported in performing these tasks.
User participation in the software development process can range from providing information about requirements, use cases and tasks, including participatory design, to end-user programming. The scientific community is devoting a great deal of effort to this direction [10], and some prototype 48 software products specifically for EUD have been developed: Topaz [15], AgentSheets [17], KidSim/Cocoa/Stagecast [8], and others [13]. Some EUD-oriented techniques have also been adopted by software for the mass market, such as the adaptive menus in MS Word or some programming-by-example techniques in MS Excel. However, we are still quite far from their systematic adoption. Within the EUD-Net activity, the following research directions have been identified as fertile for allowing end-users to develop software: 1. theoretical and empirical studies of which problems addressed by software engineering transpose to EUD, why and how; 2. studies to identify problems that are specific to EUD and are thus not addressed by software engineering; 3. research on methods and tools that address the previously identified problems in ways that are adequate for end-users. Our proposal of designing environments called Software Shaping Workshops goes in the direction of point 3 above. More specifically, we address the needs of domain-expert users, i.e., professional people who are experts in a specific application domain and want to use computer systems for their activities, but have no expertise in computer science or programming. In the literature, other authors also address the needs of domain experts [12]. We propose a design methodology for developing software environments and tools that support such domain-expert users. It aims at overcoming the current difficulties of user-system interaction, whose main causes are the phenomena described in Section 2.
2 Difficulties in user-system interaction

Several phenomena contribute to the current difficulties in user-system interaction. Some of these are described in the following:
- Communicational gap between designers and users [14]. This phenomenon is related to the variety and complexity of the knowledge involved in interactive system design, which poses a serious problem of knowledge elicitation and sharing. The communicational gap arises from the fact that designers and users have different cultural backgrounds; its main consequence is that the interactive system usually reflects the culture, skills and articulatory abilities of the designer only. Thus users often find hurdles in mapping the interactive tools onto their own specific culture, skills and articulatory abilities, and may be unable to follow their own solving strategies during the interaction process.
- User diversity. As highlighted in [4], hurdles arise in designing interactive systems because of user diversity, even within the same population. Such diversity depends not only on user skill, culture and knowledge, but also on specific (physical and/or cognitive) abilities, tasks and contexts of activity. As a consequence, specialized user dialects stem from user diversity [6], arising from the existence of user sub-communities which develop peculiar abilities, knowledge and notations, e.g. for the execution of specialized subtasks. If this phenomenon is not taken into account during system design, some users may be forced to adopt dialects related to the domain but different from their own and possibly not fully understandable, making the interaction process difficult.
- Co-evolution of systems and users [5]. It is well known that “using the system changes the users, and as they change they will use the system in new ways” [16]. These new uses of the system make the environment evolve, and force the 49 system to be adapted to the evolved users and environment.
This phenomenon is called co-evolution of system, environment and users [3]. Designers are traditionally in charge of managing the evolution of the system.
- Grain. Every tool is often suited to specific strategies for achieving a given task. Users are induced by the tool to follow strategies that are apparently easy to execute, but that may be non-optimal. This is called the “grain” in [9], i.e. the tendency to push users towards certain behaviors. Interactive systems tend to impose their grain on users' resolution strategies, a grain often not amenable to user reasoning, and possibly even misleading [9].

Because of their different cultural backgrounds, designers and users may adopt different approaches to abstraction, since, for instance, they may have different notions about the details that can be abstracted away. Moreover, users reason heuristically rather than algorithmically, using examples and analogies rather than deductive abstract tools, and documenting activities, prescriptions and results through notations they have developed themselves. These notations are not defined according to computer science formalisms; they are concrete and situated in the specific context, in that they are based on icons, symbols and words that resemble and schematise the tools and entities operated on in the working environment. Such notations emerge from users' practical experience in their specific domain of activity [9][14]. They highlight the kinds of information users consider important for achieving their tasks, even at the expense of obscuring other kinds [18], and facilitate the heuristic problem-solving strategies adopted in the specific user community. A system acceptable to its users should have a gentle slope of complexity: it should avoid big steps in complexity and keep a reasonable trade-off between ease of use and expressiveness.
Systems might offer, for example, different levels of complexity, ranging from simply setting parameters, through integrating existing components, up to extending the system by programming new components [11]. To feel comfortable, users should at any time work with a system suited to their specific needs, knowledge and tasks. To keep the system easy to learn and easy to work with, a limited number of functionalities should be available to users at any given time: those they really need and are able to understand and use. The system should then evolve with the users, offering them new functionalities when needed.

3 Software Shaping Workshops

The aim of the design methodology we are developing is to design multimedia and multimodal environments that support the activities of domain-expert users, with the objective of easing the way these users program and interact with computers. The design methodology is collaborative in that, by recognizing that users are experts in their domain of activity, it requires that representatives of the users collaborate in the development of the system as domain experts, in a team with HCI experts and software experts. The developed environments appear to their users as workshops, providing them with the tools, organized on a bench, that are necessary to accomplish their specific activities. Users work in analogy to artisans, who carry out their work using their real or virtual tools, as happens in blacksmith or joiner workshops. For this reason, the computer environments developed with this methodology are called 50 Software Shaping Workshops (SSWs) [6]. SSWs allow users to develop software artifacts without the burden of a traditional programming language, using instead high-level visual languages tailored to users' needs.
The SSW methodology is aimed at generating software environments, the workshops, in which each user sub-community interacts using a computerized dialect of its traditional languages and virtual tools, which recall the real tools with which users are familiar. In other words, the SSW approach provides each sub-community with a personalized workshop, called an application workshop. The application workshops are designed by a design team composed of various experts, who participate in the design using workshops tailored to them. These workshops are called system workshops and are characterized by the fact that they are used to generate or update other workshops. This approach leads to a workshop network that tries to bridge the communicational gap between designers and domain-expert users, since all cooperate in developing computer systems customized to the needs of the user communities without requiring users to become skilled programmers. A specific system workshop is the one used by the software engineers to lead the team in developing the other workshops. Each system workshop is exploited to incrementally translate concepts and tools expressed in computer-oriented languages into tools expressed in notations that resemble the traditional user notations, and are therefore understandable and manageable by users. More precisely, experts use a system workshop to create a workshop tailored to a more specialized user. In this way incremental development is guaranteed, thus achieving a ‘gentle slope’ approach to design complexity [11]. In fact, the team of designers performs its activity by: a) developing several specialized system workshops tailored to the needs of each designer in the team; and b) using the system workshops to develop the application workshops through incremental prototypes [4][6]. The workshop network depends on the working organization of the user community to which the network is dedicated.
Lack of space prevents us from describing in more detail how SSWs are generated in real situations. Interested readers may refer to [7], where an application in the field of factory automation is also illustrated. Recognizing the diversity of users calls for the ability to represent the meaning of a concept with different materializations (layouts, sounds, colors, timings), in accordance with local cultures, and to associate different meanings with the same materialization according, for example, to the context of interaction. To reach this goal, it becomes important to decouple the pictorial representation of data from their computational representation [2]. The XML technologies, which are founded on the same idea of separating the materialization of a document from its content, are being extensively exploited. The workshops in the SSW network are implemented as XML documents, and a software tool has been developed that allows users to create, manage and interact with such documents, whose content is distributed over the Web [4]. Finally, XML is also the technological basis for building the tools that generate the SSW network: each XML document can be steered by its users to self-transform into a new XML document representing a new workshop. On the whole, the SSW network is generated from the software engineers' system workshop by a co-evolutionary process determined by the activities of the experts in the design team. 51

4 Conclusions

Most end-users are asking for environments in which they can perform some ad hoc programming activity related to their tasks and adapt the environments to their emerging new needs. Moreover, several phenomena contribute to the current difficulty of user-system interaction, such as the communicational gap often existing between designers and users, user diversity, the co-evolution of systems and users, and the grain imposed by software tools.
The methodology discussed in this paper, by taking into account the four phenomena mentioned above, is a step toward the development of powerful and flexible environments, with the objective of easing the way end-users interact with computer systems to perform their daily work.

References

1. Boehm, B.W., Abts, C., Brown, A.W., Chulani, S., Clark, B.K., Horowitz, E., Madachy, R., Reifer, D.J. and Steece, B. Software Cost Estimation with COCOMO II, Prentice Hall PTR, Upper Saddle River, NJ, 2000.
2. Bottoni, P., Costabile, M.F., Mussio, P. Specification and Dialogue Control of Visual Interaction through Visual Rewriting Systems, ACM TOPLAS 21(6), 1077-1136, 1999.
3. Bourguin, G., Derycke, A., Tarby, J.C. Beyond the Interface: Co-evolution inside Interactive Systems - A Proposal Founded on Activity Theory, Proc. IHM-HCI 2001.
4. Carrara, P., Fogli, D., Fresta, G., Mussio, P. Toward overcoming culture, skill and situation hurdles in human-computer interaction. Int. J. UAIS, 1(4), 288-304, 2002.
5. Carroll, J.M., Rosson, M.B. Deliberated Evolution: Stalking the View Matcher in design space. Human-Computer Interaction 6 (3 and 4), 281-318, 1992.
6. Costabile, M.F., Fogli, D., Fresta, G., Mussio, P., Piccinno, A. Computer Environments for Improving End-User Accessibility. LNCS 2615, 129-140, 2002.
7. Costabile, M.F., Fogli, D., Fresta, G., Mussio, P., Piccinno, A. Building Environments for End-User Development and Tailoring. Proc. IEEE Symposia on Human Centric Computing Languages and Environments, Auckland, New Zealand, October 28-31, 2003, in print.
8. Cypher, A. and Smith, D. KidSim: End user programming of simulations. Proc. ACM Conference on Human Factors in Computing Systems, Denver, Colo., May 7–11, 1995, ACM Press, New York, 27–34, 1995.
9. Dix, A., Finlay, J., Abowd, G., Beale, R. Human Computer Interaction, Prentice Hall, 1998.
10. EUD workshop, CHI 2003 Conference, http://giove.cnuce.cnr.it/chi-eud.html
11.
EUD-Net Thematic Network, http://giove.cnuce.cnr.it/eud-net.htm.
12. Fischer, G. Seeding, Evolutionary Growth, and Reseeding: Constructing, Capturing, and Evolving Knowledge in Domain-Oriented Design Environments, ASE 5(4), 447-468, 1998.
13. Lieberman, H. Your Wish Is My Command: Programming by Example. Morgan Kaufmann, San Francisco, 2001.
14. Mayhew, D.J. Principles and Guidelines in Software User Interface Design, Prentice Hall, 1992.
15. Myers, B. Scripting graphical applications by demonstration. Proc. ACM Conference on Human Factors in Computing Systems, Los Angeles, April 18–23, 1998, ACM Press, New York, N.Y., 534–541, 1998.
16. Nielsen, J. Usability Engineering, Academic Press, San Diego, 1993.
17. Perrone, C. and Repenning, A. Graphical rewrite rule analogies: avoiding the inherit or copy & paste reuse dilemma. Proc. IEEE Symposium on Visual Languages, Halifax, Sept. 1–4, 1998, IEEE Computer Society Press, 40–46, 1998.
18. Petre, M., Green, T.R.G. Learning to Read Graphics: Some Evidence that ‘Seeing’ an Information Display is an Acquired Skill. JVLC, 4(1), 55-70, 1993. 52

[Pages 53–59: body of “Cyrano: a Character-Centered Architecture for Interactive Presentations” (Rossana Damiano, Vincenzo Lombardo, Francesca Biral and Antonio Pizzo); the text of this paper is not recoverable from this copy.]

Improving Recommendations by Integrating Collaborative Filtering and Supervised Learning Techniques

Marco Degemmis, Stefano Paolo Guida, Pasquale Lops, Giovanni Semeraro and Maria Francesca Costabile
Dipartimento di Informatica, Università di Bari, Via Orabona 4, 70125 Bari, Italy {degemmis, guida, lops, semeraro, costabile}@di.uniba.it

Abstract. Most Web-based applications serving a large variety of users are unable to satisfy their heterogeneous needs. A remedy for the negative effects of the traditional “one-size-fits-all” approach is to improve the system's ability to adapt its behavior to individual users' needs. In a previous EU-funded project, we developed a personalization component whose task is to build user profiles and provide recommendations based on them.
The approach is based on the integration of data the system collects about users, both explicitly and implicitly, and on hybrid techniques that combine classic collaborative filtering with simple filtering in order to provide appropriate recommendations. To improve the performance of our recommendation system, we adopted some heuristics and integrated a machine-learning-based system, called Profile Extractor.

1 Introduction

The severe competition among Internet-based businesses to acquire new customers and retain existing ones has made Web personalization an indispensable part of e-commerce. In this context, advanced features are necessary to tailor the Web experience to a particular user, or set of users. Personalized systems acquire preferences through interactions with users, keep summaries of those preferences in a user model, and use this model to adapt themselves and generate customized information or behavior. A key issue in the personalization of a Web site is the automatic construction of accurate, machine-processable profiles. Personalization techniques must deal with the problem of exploiting user profiles in order to search for, identify and present relevant information to the right user, in the right way, at the right time. For example, user models have been used in recommender systems for content processing and information filtering: the results returned by retrieval algorithms can be screened based on the preferences of the user. The main advantages of the one-to-one personalization paradigm based on user profiling are making the site more attractive for users, gaining their trust and confidence, and improving loyalty. The paper is organized as follows. Section 2 describes some aspects of personalization that require the construction of user models. 60 Then User Profile Engine (UPE), the personalization component that uses filtering techniques, is introduced in Section 3.
Section 4 discusses the Profile Extractor (PE), the component that produces recommendations by exploiting history information. The integration between UPE and PE is presented in Section 5. Finally, some issues concerning the evaluation of recommendations and our experimental work are described in Sections 6 and 7.

2 Building user models for personalization

By user profile we mean all the information collected about a user who logs on to a Web site, in order to take his or her needs, wishes and interests into account. A user profile is a structured representation of the user’s needs, which a retrieval system can exploit in order to autonomously pursue the goals posed by the user. In a user profile modeling process, we have to decide what has to be represented and how this information is effectively represented. Generally, the information stored in a user profile can be conceptually categorized into several classes according to its source, for example registration data, questions & answers (Q&A), legacy data, past history, third-party data (gathered from marketing databases, demographic analyses, etc.), and current activities (the set of actions performed by the customer in the current session). A user profile is given by a list of attributes, each representing a characteristic of that user. These attributes, or features, can be divided into three categories: a) explicit: values given by the users themselves (registration data or Q&A); b) existing: values that can be drawn from existing applications (e.g., job); c) implicit: values elicited from the user's behavior, through the history of his or her navigation or from the current navigation alone. A simple approach to acquiring user preferences is the manual construction of a user profile: buyers have to fill in a form that asks for personal data and some specific information.
In this way, only a limited amount of information can be acquired, and the personalization service may end up exploiting unreliable or wrong data. For this reason, we adopt an approach that dynamically updates the user model by considering data recorded on past visits to the store (transactions).

3 The personalization component

In previous work, we developed UPE (User Profile Engine), a recommender system that provides personalized suggestions (recommendations) about pages users might find interesting in a product catalogue on the Web [3]. The approach adopted in UPE is based on the integration of data the system collects about users, both explicitly and implicitly, and filtering techniques, primarily collaborative but also simple filtering, in order to provide appropriate recommendations in any circumstance during a visit to the on-line trade-fair catalogue [3]. Simple filtering relies on predefined classes of users to determine what content, or generic item, should be displayed or what service should be provided. For 61 example, employees of the research department may have access to some functionality that is not available to employees of other departments. The collaborative filtering technique consists of collecting user opinions on a set of objects, using ratings provided explicitly by the users or computed implicitly, thus forming peer groups, and then exploiting the peer groups to predict the interest of a particular user in an item. Some examples of systems incorporating a personalization component based on filtering techniques are in [2], [5], [8]. One of the disadvantages of the collaborative filtering method is that it requires a large user database in order to find a peer group for each visitor. This can imply a long learning curve, because at the beginning, when the number of Web site visitors is small, the quality of recommendations will be low. The results improve gradually as the number of users increases.
The hybrid approach we have used to integrate simple and collaborative filtering techniques helps overcome the disadvantages mentioned above. In particular, if no collaborative recommendation can be computed for a particular user, UPE provides him or her with “simple-filtering-based” recommendations (if available). The ratings collected by the system during user interaction may be both implicit and explicit. Explicit ratings are collected when users tell the system what they think about an item in some form. Even though explicit ratings are fairly precise, they have some disadvantages: 1) stopping to enter explicit ratings can alter users' normal browsing and reading patterns; 2) unless users perceive a benefit in providing ratings, they may stop providing them. Implicit ratings are much harder to pin down, but they have the following advantages: a) every interaction with the system (and every absence of interaction) can contribute to an implicit rating; b) the system can gather them for free; c) several types of implicit ratings can be combined for a more accurate rating; d) combining them with explicit ratings makes it possible to obtain an enhanced rating. The most effective technique exploits both implicit and explicit ratings. The collaborative filtering algorithm developed for UPE uses the combination of the two types of ratings and a user correlation measure (the Pearson correlation) to predict user interest in items not yet evaluated by the user. To improve UPE's performance, we have also defined some heuristics that reduce the number of users involved in the computation of user preferences. UPE re-computes the correlations for each pair of users in which at least one of the two has interacted with the system and has modified a number of ratings above a certain threshold. UPE also re-computes ratings, via the collaborative filtering algorithm, only for users who have interacted with the system more than a given number of times.
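The paper does not give UPE's exact formulas; as an illustration only, the following is a minimal sketch of the classic Pearson-correlated, mean-offset prediction scheme commonly used in collaborative filtering (the data layout and function names are ours, not UPE's):

```python
from math import sqrt

def pearson(ratings_a, ratings_b):
    """Pearson correlation between two users, over co-rated items only."""
    common = set(ratings_a) & set(ratings_b)
    if len(common) < 2:
        return 0.0
    mean_a = sum(ratings_a[i] for i in common) / len(common)
    mean_b = sum(ratings_b[i] for i in common) / len(common)
    num = sum((ratings_a[i] - mean_a) * (ratings_b[i] - mean_b) for i in common)
    den = sqrt(sum((ratings_a[i] - mean_a) ** 2 for i in common)) * \
          sqrt(sum((ratings_b[i] - mean_b) ** 2 for i in common))
    return num / den if den else 0.0

def predict(user, item, ratings):
    """Predict user's rating for item as the user's mean rating plus the
    correlation-weighted deviation of the other users' ratings."""
    mean_u = sum(ratings[user].values()) / len(ratings[user])
    num = den = 0.0
    for other, r in ratings.items():
        if other == user or item not in r:
            continue
        w = pearson(ratings[user], r)
        mean_o = sum(r.values()) / len(r)
        num += w * (r[item] - mean_o)
        den += abs(w)
    return mean_u + num / den if den else mean_u
```

In this scheme, UPE's heuristics would simply skip the `pearson` recomputation for user pairs whose ratings have not changed beyond the threshold.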
4 Learning from transactions

Among the issues the personalization community must deal with, the following are of special importance: how to provide personal recommendations based on comprehensive knowledge of who customers are and how they behave, and how to extract this knowledge from the available data and store it in user profiles. To address these issues, we have adopted an approach that uses information learned from transactional histories to construct individual profiles. The advantage of this technique is that profiles generated from a huge number of transactions tend to be statistically reliable. 62 The Profile Extractor (PE) [1], [11] is a personalization module designed according to this approach at the University of Bari. It employs supervised learning techniques to dynamically discover users' preferences from transactional data recorded during past visits to an e-commerce Web site. The system was tested on a virtual bookstore, where the preferences are the main categories into which the product catalogue is subdivided. PE builds profiles containing the product categories the user is interested in. From our point of view, the problem of learning a user's preferences can be cast as the problem of inducing general concepts from examples labeled as members (or non-members) of those concepts. In this context, given a finite set of categories of interest C = {c1, c2, …, cn}, the task consists of learning each target concept Ti, “users interested in category ci”. In the training phase, each user represents a positive example of users interested in the categories he or she likes and a negative example of users interested in the categories he or she dislikes. The subset of the instances chosen to train the learning system has to be labeled by a domain expert, who classifies each instance as a member or non-member of each category. The training instances are processed by PE, which induces a classification rule set for each category.
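The paper does not specify which rule learner PE uses; purely as an illustration of inducing a per-category rule set from expert-labeled users, here is a minimal OneR-style learner over discretized transactional features (the feature names and data are hypothetical):

```python
def induce_rules(examples, labels):
    """OneR-style induction: pick the single attribute whose
    value -> majority-label mapping best fits the training set,
    and return it as a small rule set."""
    best_attr, best_rules, best_correct = None, None, -1
    for attr in examples[0]:
        rules, correct = {}, 0
        for v in set(e[attr] for e in examples):
            pos = sum(1 for e, y in zip(examples, labels) if e[attr] == v and y)
            neg = sum(1 for e, y in zip(examples, labels) if e[attr] == v and not y)
            rules[v] = pos >= neg          # rule: attr == v -> interested?
            correct += max(pos, neg)
        if correct > best_correct:
            best_attr, best_rules, best_correct = attr, rules, correct
    return best_attr, best_rules

def classify(rule_set, example, default=False):
    """Apply the induced rule set to a new user description."""
    attr, rules = rule_set
    return rules.get(example[attr], default)
```

A real learner such as the one behind PE would induce richer multi-attribute rules, but the training/classification split is the same: one rule set per catalogue category, applied to every registered user.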
The rule sets are used to predict whether a user is interested in each category. The profiles inferred by PE are coarse-grained: they contain the categories of interest of a user. Our intention was to enhance the profiles by taking into account the user's preferences within each category, in order to achieve more precise recommendations.

5 UPE/PE integration

To improve performance, different methods have sometimes been combined in hybrid recommenders: it is possible to gain better performance with fewer of the drawbacks of the individual techniques. There are several hybridization methods for combining different techniques [4]. Our idea is to produce a hybrid method by integrating the behavioral profiles inferred by PE and the collaborative method implemented by UPE into one integral approach, in an attempt to demonstrate that it outperforms the pure collaborative filtering method. The resulting system, U(PE)2, implements a cascade hybrid method: the profiles inferred by PE are exploited to group customers with similar preferences. In our case, preferences are the product categories the customer is interested in. Our idea is that profiles can drive the collaborative method by restricting the set of users to which the algorithm is applied to those interested in the same product categories. PE is applied first to identify distinct groups, or “classes”, of users. For example, users can be grouped as interested in a particular “content category”. Then the collaborative filtering algorithm is applied to each group of users. In this way, it is possible to improve computational performance by carrying out the computation for each group in parallel. In practice, we use PE to classify registered users and assign them to the content categories of their interest; we then apply the collaborative filtering algorithm to the users of each class, in order to generate recommendations that fit their interests.
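The cascade idea can be sketched as follows, assuming profiles are represented as sets of category labels; for brevity the inner predictor here is just a peer-group mean, whereas U(PE)2 runs its Pearson-based collaborative filtering within each group (all names are illustrative):

```python
def cascade_recommend(profiles, ratings, user, item):
    """Cascade hybrid (sketch): restrict the peer group to users sharing
    at least one profile category with the target user, then predict the
    item's rating from that group alone."""
    my_cats = profiles[user]
    peers = [u for u in ratings
             if u != user and profiles[u] & my_cats and item in ratings[u]]
    if not peers:
        return None  # no group prediction; UPE would fall back to simple filtering
    return sum(ratings[u][item] for u in peers) / len(peers)
```

Restricting the peer group this way is what makes the per-category computation both smaller and parallelizable, as described above.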
Several experiments are necessary to effectively quantify the improvement due to the use of hybrid techniques. 63

6 Evaluating recommendations

There are two main aspects related to the evaluation of recommendations: 1) the need to measure the effectiveness of recommendations, in order to compare different systems and determine which is best; 2) the analysis of the impact of the recommendations from the point of view of the user. The first aspect is more technical and needs the definition of metrics to measure the effectiveness of a recommender system, or to establish whether a hybrid system is better than one implementing a single technique. The measures adopted are:
– Mean Absolute Error (MAE): a measure of the deviation of predictions from the true, user-specified rating values; the MAE between ratings and predictions is a widely used metric;
– Normalized Distance-based Performance Measure (NDPM), as described in [12];
– Precision (Pr), Recall (Re) and F1, as described in [9].
The second aspect concerns the analysis of the impact of recommendations from the users' point of view. Specifically, in B2C e-commerce a user explicitly wants the site to take his or her information, such as preferences, into account in order to improve access to relevant product information. In [10] a system is described where user profiles and recommendations are exploited to offer better support to customers during searches in the product catalogue of an e-commerce site. An experimental evaluation has been performed to compare two versions of the system: one with the personalization component, and one without. The results show that the data retrieved from a search engine are more specific to the user's interests when personalization is exploited. Finally, we would like to draw attention to how personalization relates to privacy [7]. Specifically, in the e-commerce field privacy is a central concern for personalization: not all data about users may be collected or used.
During the development of the Profile Extractor system we tackled the privacy concerns by considering only transactional data during the learning process: we did not use any personal data. Another fundamental aspect to consider is that the system must infer knowledge that is easily understandable by users: our approach uses rules to describe the model of the users' preferences, and these rules can be viewed and validated by the users themselves.

7 Experimental work

We performed two experiments in order to compare the performance of the proposed hybrid recommender system U(PE)2 with the UPE system, as described in more detail in [6]. For both experiments we used historical browsing data from an Italian e-commerce company. This dataset contains information about 380 users on 154 catalogue products; in particular, it contains explicit rates given by users and implicit rates computed by the system on the basis of user behavior. Each action performed by a user on a Web page, for example zooming on the picture of a product, corresponds to a rate. Both explicit and implicit rates vary from 0 to 5. In total, the dataset has 9,073 rates for 154 products. The average number of rates per user is approximately 24, and the sparsity of the user-rates matrix is approximately 85%.
– Experiment 1: evaluation of UPE implementing the classical collaborative filtering technique. The dataset was converted into a user-product matrix with 380 rows (i.e., 380 users) and 154 columns (i.e., products that were rated by at least one of the users). The experiment was repeated 5 times, selecting a different test set each time (the intersection of the five test sets was empty).
– Experiment 2: evaluation of the hybrid system U(PE)2, obtained by integrating the behavioral profiles inferred by PE with the UPE collaborative method. The performance of U(PE)2 was compared with that of the UPE personalization system.
The dataset was converted into 11 user-product matrices, each corresponding to a specific product category ci in which PE classified the users. Each matrix had ni rows (i.e., the number of users that PE classified as interested in category ci) and 154 columns (i.e., products that were rated by at least one of the users). In this case, the UPE collaborative filtering was applied separately to each matrix. Like experiment 1, experiment 2 was performed 5 times. The most important observation from the results of the first experiment is the high accuracy of the UPE system on the whole dataset in predicting the ranking of the products according to the customers' interests (the NDPM value is 0.066). The high value of the F1 measure and the balance between recall and precision demonstrate that the list of recommendations presented to users by UPE contains correctly ranked items. We also recorded the time required to generate predictions for the entire dataset (380x154) and found that the process is computationally very expensive (5h 47min on average over the 5 runs). In the second experiment we examined separately the recommendation accuracy for users grouped according to their behavioral profiles computed by PE. The MAE results are positive, given the small number of users belonging to each category (from 35 users in the category “kitchen articles” to 74 users in the category “underwear”). The NDPM results are also very positive (values do not exceed 0.2), showing a strong correlation between the ranking imposed by the users and the ranking computed by the system, although there is a high degree of variation between categories. NDPM is better for strongly correlated users belonging to “more populated” categories: the best values were found in the categories “underwear” (74 users) and “hardware”, which show the highest values of user correlation. As for the F1 score, we consider the results very positive.
Overall, 8 out of the 11 categories reported values exceeding 0.80, while only for one category (“kitchen articles” again) was the system unable to reach a value of at least 0.70. The time required by U(PE)2 was 57min for computing recommendations and 1h 27min for classifying users into the 11 categories; the total time for completing the process was 2h 24min. As regards MAE, the value achieved by UPE is almost five times better than the value we registered for U(PE)2. UPE also outperforms U(PE)2 on both NDPM and the F1 measure. This result is to be expected, as the collaborative filtering algorithm implemented by UPE generates recommendations based on the strength of the association among users, and it is adversely affected by reduced training sets containing poorly correlated users. The most important observation from the analysis of the results of experiment 2 is that the number of neighbors and their correlation have a significant effect on the quality and effectiveness of recommendations: even if two users are interested in the same categories, we are able to produce good recommendations only if there is a strong association between them. When we focus on performance issues, we find the main advantage of grouping users according to their behavioral profiles before computing recommendations: the total time required by UPE was more than twice the time required by U(PE)2.

References
1. Abbattista, F., Degemmis, M., Licchelli, O., Lops, P., Semeraro, G., and Zambetta, F.: Improving the usability of an e-commerce web site through personalization. In Ricci, F., and Smith, B. (Eds.), Recommendation and Personalization in E-commerce, Proceedings of the Workshop on Recommendation and Personalization in Electronic Commerce, 2nd Int. Conf. on Adaptive Hypermedia and Adaptive Web Based Systems (2002), 20–29.
2. Bueno, D., Conejo, R., and David, A.: METIOREW: An Objective Oriented Content Based and Collaborative Recommending System. In Reich, S., Tzagarakis, M.M., and De Bra, P.M.E. (Eds.), Hypermedia: Openness, Structural Awareness, and Adaptivity, LNCS, Vol. 2266, Springer (2002).
3. Buono, P., Costabile, M.F., Guida, S., and Piccinno, A.: Integrating User Data and Collaborative Filtering in a Web Recommendation System. In Reich, S., Tzagarakis, M.M., and De Bra, P.M.E. (Eds.), Hypermedia: Openness, Structural Awareness, and Adaptivity, LNCS, Vol. 2266, Springer (2002), pp. 315–321.
4. Burke, R.: Hybrid Recommender Systems: Survey and Experiments. User Modeling and User-Adapted Interaction, 12 (2002), Kluwer Academic Publishers, The Netherlands, pp. 331–370.
5. Cotter, P., and Smyth, B.: PTV: Intelligent Personalised TV Guides. In Proceedings of the 12th Innovative Applications of Artificial Intelligence (IAAI-2000) Conference, AAAI Press, 2000.
6. Degemmis, M., Lops, P., Semeraro, G., Costabile, M.F., Guida, S.P., and Licchelli, O.: Improving Collaborative Recommender Systems by Means of User Profiles. In Karat, C.M., Blom, J., and Karat, J. (Eds.), Designing Personalized User Experiences for eCommerce, Kluwer (to appear).
7. Kobsa, A.: Tailoring privacy to users' needs. In Bauer, M., Gmytrasiewicz, P., and Vassileva, J. (Eds.), User Modeling, Lecture Notes in Artificial Intelligence 2109, Springer, Berlin (2001), 303–313.
8. Kobsa, A., Koenemann, J., and Pohl, W.: Personalized Hypermedia Presentation Techniques for Improving Online Customer Relationships. The Knowledge Engineering Review, 16(2) (2001), pp. 111–155.
9. Sebastiani, F.: Machine Learning in Automated Text Categorization. ACM Computing Surveys, 34(1) (2002), 1–47.
10. Semeraro, G., Andersen, H.H.K., Andersen, V., Lops, P., and Abbattista, F.: Evaluation and Validation of a Conversational Agent Embodied in a Bookstore. In N. Carbonell and C.
Stephanidis (Eds.), Universal Access: Theoretical Perspectives, Practice and Experience, Lecture Notes in Computer Science 2615, Springer, Berlin (2003), 360–371.
11. Semeraro, G., Abbattista, F., Degemmis, M., Licchelli, O., Lops, P., and Zambetta, F.: Agents, Personalisation, and Intelligent Applications. In Corchuelo, R., Cortés, A.R., and Wrembel, R. (Eds.), Technologies Supporting Business Solutions, Part IV: Data Analysis and Knowledge Discovery, Chapter 7, Nova Science Books and Journals (2003), 163–186.
12. Yao, Y.Y.: Measuring Retrieval Effectiveness Based on User Preference of Documents. Journal of the American Society for Information Science, 46(2) (1995), pp. 133–145.

Evidences for a Prototypical Organization of Websites’ Page Layout

Francesco Di Nocera, Corinne Capponi, and Fabio Ferlazzo
Cognitive Ergonomics Laboratory, Department of Psychology, University of Rome “La Sapienza”, Italy
{francesco.dinocera, corinne.capponi, fabio.ferlazzo}@uniroma1.it

Abstract. The study reported in this paper was aimed at investigating the existence of schemata specifically involved in the cognitive organization of a web page. In particular, the hypothesis was that some web objects (namely, links to specific contents) might be expected by users at specific spatial locations. The results confirm that users' expectations are due to the activity of “low” and “high” level schemata allowing performance optimization.

1 Introduction

According to several theoretical accounts, interaction with technological artifacts happens by means of representations, or schemata [6], allowing the optimization of our behaviors [1, 9]; this implies that the human cognitive system selects the most effective strategies for interacting with other people, objects, and (more generally) events. Recently, our laboratory applied this rationale to the field of traffic psychology [3], showing the existence of an optimization criterion for speed selection aimed at minimizing mental workload.
This mechanism was related to the activity of expertise-based schemata favoring expert drivers. Given the generality of this approach, interaction with web sites could also be based on schemata aimed at optimizing users' behaviors during navigation tasks. Some of those schemata may be involved in the way people evaluate web sites (see [4] for a model consistent with these ideas), whereas others may underlie the type of actions that are executable within a web site. An additional class of schemata may be involved in the way people look for information, either within the entire site or within a single web page. Such hypothetical schemata might refer to a prototypical organization of the information, and would contain rules and specifications for the location of 1) objects within the page layout, and 2) contents within the site structure. Schemata are hierarchically organized: from general, expertise-dependent schemata to lower-level schemata that are strictly intertwined with structural and functional constraints of the cognitive system. For instance, page scanning often occurs consistently with reading direction, which is culturally dependent, but not expertise-dependent. Of course, the reading-direction schema is not located at the very bottom of the hierarchy; yet, for our aim, this is still a good level of specification.

2 Cognitive GeoConcept (CG)

It is commonly accepted that effective design cannot leave out knowledge about the mental models of the user. Hence, the schemata involved in human interaction with technology are quite relevant. Any cognitive-based perspective agrees with the idea that if the system appearance (the visible structure of the system, which acts as a filter between the user and the designer) is not consistent with the design (the outcome of the designer's conceptual model), the user will likely experience a frustrating, or at best unsuccessful, interaction.
Understanding how individuals find objects in a web page may favor a design reflecting the type of organization the user expects, and make sites more accessible, easier to browse, and more satisfactory. The idea that the way people relate to objects deployed in space has a critical role in interface design is not new. Several authors [8, 10] showed that interaction with hypertext (and this also applies to the Internet) is strongly affected by users' spatial knowledge. Card sorting, for example, is a practice that has proven very useful for organizing informative units into a hierarchical structure, particularly when the amount of information to be delivered is abundant [5]. However, the elements arranged within a single web page should also be deployed in a way suited to optimizing navigation. Unfortunately, research on this issue is scarce. The only empirical study on expected locations for web objects is Bernard's [2]. Regrettably, that research was affected by serious methodological problems. For example, users did not interact with a computer, but with a depiction of a browser window, on which they had to arrange pictures of links and banners according to positions defined a priori by an 8 x 7 grid of squares. Most importantly, subjects performed the task only once per web object (except for advertisement banners, for which there were two trials). However, we would like to make it clear that our considerations about the shortcomings of Bernard's study concern only its experimental rigor and by no means its applied potential. The “paper prototyping” technique [7] the author used is indeed quite useful and popular among interaction designers. The present study is based on a less explicit method, eliciting users' responses to verbal labels indicating the to-be-positioned web objects over a large number of trials.
We tentatively called this method “Cognitive GeoConcept”, for it is aimed at finding geometrical associations between meaningful objects (links to other pages or functions). According to what we reported above, two different results can be expected from this study:
1. “low level” schemata (spatially and semantically based) are involved in the process: both experienced and inexperienced users should show the same pattern of arrangement;
2. “high level” schemata (mainly based on navigation experience) are involved in the process: only experienced users should show an interpretable pattern of arrangement.

3 Method

Participants. Twenty-three students (14 females) volunteered for this experiment. Their mean age was 25.2 years. Thirteen subjects reported using the Internet every day, and were classified as experts. Subjects classified as novices reported navigating a few times per week (5 users) or per month (5 users). All users reported being right-handed, with normal or corrected-to-normal vision, and were naïve as to the purpose of the experiment.

Stimuli. Fourteen words indicating links to resources often found in web sites (about us, buy, catalog, check your e-mail, contact us, help, home, jobs, news, play & win, register, resources, restricted area, search) were used as stimuli.

Procedure. Participants sat in front of a 17” computer monitor. They had to respond as quickly as possible to the stimuli by clicking on the area of the (blank) screen where they would expect to find the link. Stimuli were presented centrally, white on black, for 200 ms. On each trial the mouse pointer returned to the center of the screen. Fifty repetitions of each stimulus were randomly administered to the subjects.

Fig. 1. A graphical representation of the Cognitive GeoConcept procedure: 1) a string indicating a link (e.g. “buy”) is centrally presented on the screen; 2) subjects click on the portion of the screen where they would expect to find that link; 3) another stimulus (e.g.
“home”) is presented and a new click is required. Repetitions of the same stimuli are randomly administered, making the number of trials very large (up to 700 in the present experiment).

Analyses. The Complete Spatial Randomness (CSR) hypothesis was tested separately for the Experts' and Novices' click distributions using the Nearest Neighbor Index (NNI). Coordinates were then analyzed using Cluster Analysis (Ward's method). The input distance matrices (for Experts and Novices) were created using average point-to-point Euclidean distances. Positioning responses were further examined using quadrat counts: a 4 x 4 grid was used to divide the area into 16 quadrats. Angular transformations of the proportions of clicks within the quadrats were then analyzed using a mixed ANOVA design: Expertise (Experts vs. Novices) x Link (contrasting the 14 different links) x Row (1st vs. 2nd vs. 3rd vs. 4th) x Column (1st vs. 2nd vs. 3rd vs. 4th).

Fig. 2. An imaginary 4 x 4 grid was used for separating the screen into 16 quadrats. Angular transformations of the proportion of clicks within the quadrats were used as dependent variables.

4 Results

The CSR test showed that the two point distributions were not random. The Experts' NNI was 4.02 (Z=77.99, p<0.01), whereas the Novices' NNI was 3.28 (Z=51.67, p<0.01). Both results indicated regularity. Cluster Analysis showed different patterns for the two groups. Experts showed five clusters: user input (register, check your e-mail), user commitment (play & win, buy, contact us), company info (news, search, about us, home), corporate identity (product catalog, jobs), and access to resources (resources, restricted area). One link (help) joined the others only at a larger distance. Novices showed six clusters: user input (buy, search, play & win), access to resources (resources, register), corporate identity (catalog, jobs), general functions (home, help), and two non-interpretable clusters. One link (check your e-mail) was separated from the other groups.
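The CSR test reported above can be illustrated with a Clark-Evans-style nearest-neighbour computation. This is our own sketch, not the authors' analysis code; the study area and the example coordinates are assumptions.

```python
import math

def nearest_neighbor_index(points, area):
    """Clark-Evans NNI sketch: ratio of the observed mean
    nearest-neighbour distance to the one expected under complete
    spatial randomness (CSR). Values below 1 suggest clustering,
    values above 1 regularity; z tests the departure from CSR."""
    n = len(points)
    nearest = []
    for i, (xi, yi) in enumerate(points):
        # Distance from each point to its closest neighbour.
        d = min(math.hypot(xi - xj, yi - yj)
                for j, (xj, yj) in enumerate(points) if j != i)
        nearest.append(d)
    d_obs = sum(nearest) / n
    density = n / area
    d_exp = 1 / (2 * math.sqrt(density))       # CSR expectation
    se = 0.26136 / math.sqrt(n * density)      # CSR standard error
    z = (d_obs - d_exp) / se
    return d_obs / d_exp, z
```

A perfectly regular arrangement, such as a unit-spaced grid, yields an NNI above 1, consistent with the regularity reported for both groups.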
The Analysis of Variance showed a significant Expertise x Link x Row x Column third-order interaction (F(117,2457)=1.63, p<0.01). Thus, separate factorial analyses were run for each link. A “Row” main effect was found for the stimuli “check your e-mail”, “registered users”, and “jobs” (F(3,63)=2.80, p<0.05; F(3,63)=2.70, p<0.05; F(3,63)=3.67, p<0.05, respectively). Duncan testing showed that the individuals' preference for locating those stimuli significantly increased downward. Furthermore, both “registered users” and “jobs” showed a main effect of “Column” (F(3,63)=4.31, p<0.01; F(3,63)=3.04, p<0.05, respectively). Duncan testing showed that subjects located both stimuli within the outermost (left and right) columns significantly more often than within the two central columns. Row by Column interactions were found for the proportions of clicks on the stimuli “search” (F(9,189)=2.17, p<0.05), “news” (F(9,189)=2.63, p<0.01), “about us” (F(9,189)=2.50, p=0.01), and “home” (F(9,189)=9.91, p<0.01). The high variability of clicks on “search” did not allow any further interpretation of this result. On the contrary, Duncan testing showed that “news” was located in the quadrat defined by the first row and the second column significantly more often than in all the other positions, whereas “about us” was located more often within the first column (with a proportion of clicks increasing from the lower to the upper part of the screen). Finally, “home” was most often assigned to the upper-left corner. An Expertise by Row interaction was found for the stimulus “home” (F(3,63)=4.84, p<0.01). Duncan testing showed that it was due to the novices locating this link also in the lower part of the screen. Three links (buy, help, and resources) showed an Expertise by Row by Column interaction (F(9,189)=2.49, p<0.01; F(9,189)=7.32, p<0.01; and F(9,189)=2.32, p<0.05, respectively).
In particular, “buy” was mostly assigned to the lower-right corner by experienced users, whereas novices preferred the upper-right area; “help” was assigned to the upper-right corner only by experts; and “resources” was assigned to the upper part of the screen by experts and novices with opposite patterns (leftward for experts vs. rightward for novices).

Fig. 3. Horizontal tree diagram summarizing the clustering of links for the experts. Note how easy the naming of the clusters is in this case.

Fig. 4. Horizontal tree diagram summarizing the clustering of links for the novices. Note how difficult the naming of the clusters is in this case.

5 Discussion and Conclusions

This study was aimed at investigating the existence of schemata specifically involved in the cognitive organization of the web page layout. Our hypothesis was that the location of some links would be expected at specific spatial locations. Such expectations would be due to the activity of schemata whose aim is to optimize users' performance. Two possible patterns of results could be expected, according to the type of schemata involved in the process: either spatially and semantically based schemata, or schemata based on navigation experience. The cluster analysis results partially supported the first prediction, as some important clusters matched across the two groups. However, the experts' clusters were the only really interpretable ones, whereas the novices showed at least two non-interpretable clusters. We are aware that one of the most important issues affecting users' performance was the nature of the stimuli we used. Indeed, our study was general in its scope, and the stimuli represented links to functions and resources available in different types of websites. Using links from one type of website might provide much clearer clusters. The analyses performed on single links provided information extending that obtained from the cluster analysis. The expected positions matched the most common ones among websites (e.g.
home, help, etc.), supporting the idea that strategies also affect users' performance. Our results are thus in contrast with those reported by Bernard [2], who did not find an effect of expertise on spatial organization. Of course, this study leaves many questions unanswered about the nature of the schemata discussed above. However, it may be useful in the future to use the knowledge gained in this domain to test the efficacy of specific page layouts. For instance, one might evaluate the performance of users interacting with websites that organize space according to the groupings found here. Also, extending this testing procedure to specific groups of users (e.g. juniors vs. seniors) would be useful. This may eventually improve our understanding of the processes involved in human-technology interaction, and help us design “objects” that wait for users' input right where users expect them to be waiting.

References
1. Anderson, J. (1991). The adaptive nature of human categorization. Psychological Review, 98, 409-429.
2. Bernard, M.L. (2001). Developing schemas for the location of common web objects. Proceedings of the Human Factors and Ergonomics Society 45th Annual Meeting, 1, 1161-1165.
3. Couyoumdjian, A., Di Nocera, F., & Ferlazzo, F. (2002). Spontaneous speed: theoretical and applied considerations. In D. de Waard, K.A. Brookhuis, J. Moraal, & A. Toffetti (Eds.), Human Factors in Transportation, Communication, Health, and the Workplace (pp. 175-188). Maastricht: Shaker Publishing.
4. Di Nocera, F., Ferlazzo, F., & Renzi, P. (1999). Us.E. 1.0: costruzione e validazione di uno strumento in lingua italiana per valutare l'usabilità dei siti Internet [Us.E. 1.0: construction and validation of an Italian-language tool for evaluating the usability of Internet sites]. In M.F. Costabile, & F. Paternò (Eds.), Proceedings of HCITALY '99. Pisa: CNR-CNUCE.
5. Maiden, N.A.M., & Hare, M. (1998). Problem domain categories in requirements engineering. International Journal of Human-Computer Studies, 49(3), 281-304.
6. Norman, D. A., & Shallice, T. (1986).
Attention to action: willed and automatic control of behaviour. In R.J. Davidson, G.E. Schwartz, & D. Shapiro (Eds.), Consciousness and Self-Regulation, Vol. 4 (pp. 1-18). New York: Plenum Press.
7. Rettig, M. (1994). Prototyping for tiny fingers. Communications of the ACM, 37(4), 21-27.
8. Rosenfeld, L., & Morville, P. (1998). Information Architecture for the World Wide Web. Sebastopol: O'Reilly & Associates, Inc.
9. Tversky, A., & Kahneman, D. (1974). Judgement under uncertainty: heuristics and biases. Science, 185, 1124-1131.
10. Wurman, R.S., & Bradford, P. (1996). Information Architects. New York: Graphis Press.

Evaluation methodologies and user involvement in user modeling and adaptive systems

Cristina Gena
Dipartimento di Informatica, Università di Torino
Corso Svizzera 185, 10149 Torino, Italy
Email: {[email protected]}

Abstract. In this paper I present the main issues and conclusions of my PhD thesis, which was dedicated to the problem of the evaluation and testing of user modeling and adaptive systems. Both standard and less explored approaches are taken into account, and new perspectives and examples are sketched here.

1. Introduction

My PhD thesis [4] addressed the issue of evaluation and user involvement in the development of user modeling and adaptive systems. During the last few years the international community has underlined the importance of evaluation for a more user-centered approach to these systems [2]. Nevertheless, significant evaluations are not so frequent. Thus, the goal is now to make testing a common practice in the development of such systems, and to carry out evaluations in every design phase in order to report significant results. In this paper I sketch how standard (Section 2) and less explored (Section 3) methodologies can be applied to the evaluation and development of user modeling and adaptive systems in order to achieve fruitful results. Section 4 concludes the paper.

2.
HCI and information selection process evaluations

The methodologies for evaluating user-adapted systems are generally borrowed from those used in HCI and from those exploited for the evaluation of the information selection process (mainly derived from the evaluation of information retrieval systems). The former can be classified as i) collection of users' opinions, ii) user observation and monitoring, iii) predictive evaluation, iv) formative evaluation, and v) summative evaluation; for more details see [8]. The latter encompass metrics such as precision and recall, training and test sets, evaluation of the ordering, coverage, MAE and RMSE, reversal rate and sensitivity measures, etc.; for more details see [5]. All these methodologies can be exploited during the different design phases of a user-adapted system following a user-centered design approach. However, the consequences are different from those achieved in regular HCI systems. First of all, evaluation methodologies should be applied following a layered approach since, as described in [1], user-adapted systems need an evaluation that differentiates, at least, problems concerning content adaptation from problems concerning interface adaptation. For instance, in the evaluation of a system that provides personalized tourist information on board cars [4], we evaluated separately i) the effectiveness of the adaptations, by calculating the distances between the user choices and the system recommendations, and ii) the correctness of the interface solutions, by means of a usability test. In my thesis I also proposed a layered approach following the three tasks into which Kobsa et al. [6] divide personalized hypermedia applications: i) acquisition method and primary inferences, ii) representation and secondary inferences, iii) adaptation production. For each task I proposed the most appropriate methodologies; for details see [4]. This is not a new idea: a similar approach has already been proposed by Paramythis et al. [7].
It is important to underline in this context that, as Paramythis et al. noticed, since the concept of the typical user of a system cannot be applied to adaptive systems, each evaluation has to take into account a particular user, having characteristics encoded in some type of user profile, in a particular context of use. I would also like to note that the evaluation methodologies of user-adapted systems can be used as knowledge sources for the development of adaptive applications. The exploitation of evaluation techniques therefore leads to a generative approach in the development of these systems. For instance, task analysis can be used to analyze not only the way people perform their jobs, but also how different kinds of users perform their jobs, and then to model the interaction on the basis of these different models. In the case of heuristic evaluation, we need to know not whether an interface works for a generic user, but how it can work for different kinds of users; we can then design the interface on the basis of the experts' suggestions about these users.

3. Qualitative approaches

The exploitation of qualitative methodologies in the evaluation of user modeling and user-adapted systems is not a new idea, even if quantitative methodologies are more widely applied, in particular controlled lab experiments exploiting quantitative metrics. Preece et al. [8] classify the main qualitative methodologies under the umbrella term interpretative evaluation, which can be summed up as “spending time with users”. Interpretative evaluation comes in these flavors: i) contextual inquiry, ii) cooperative and participative evaluation, and iii) ethnography. Regarding qualitative evaluation, I gained insightful inspiration from Paul Dourish's theory of embodied interaction [3]. Dourish defines embodied interaction as “an interaction with computer systems that occupy our world, a world of physical and social reality, and that exploit this fact in how they interact with us”.
While the traditional computational model of HCI has been rationally built on a procedural foundation, setting out its account of the world in terms of plans, procedures, tasks and goals, Dourish's model of HCI places interaction at the center of the picture. By this, he considers interaction not only as what is being done, but also how it is being done. Therefore, in opposition to the dominant cognitive approach to HCI, based on a goal-driven view of human-machine interaction, Dourish draws on both sociology (in particular qualitatively based approaches such as ethnomethodology) and phenomenology for an approach to HCI more oriented to the embodied interaction between human and machine. Evaluating systems in a qualitative way requires fewer users than quantitative research. Qualitative researchers maintain that qualitative methodologies allow a deeper knowledge of the subjects involved, which compensates for the less representative sample. In fact, what this kind of methodology can offer is a more accurate knowledge of the real behavior of a user sitting in front of an interactive system, compared to the artificial situation of a lab environment. In the case of user-adapted systems, this information can be used i) during the development of a user-adapted system, by singling out, for instance, the dimensions for modeling the users, and ii) for a system revision after the evaluation. The problem concerns the difficulty of modeling an interaction by taking into account a “situated action” instead of a plan predetermined by a “search space” of goals and actions. The two perspectives seem to be opposite, but I would like to find some points of contact. In particular, my question is how qualitative and embodied approaches can be applied to user-adapted systems.
As Dourish emphasized [3], from a phenomenological point of view we reach meaning by acting in the world; so, in the case of user-machine interaction, a subject gains knowledge about the system by using the system and interacting with and through it. Likewise, for “the system”, the subject becomes meaningful during the interaction. Another key point of the phenomenological perspective is that experience and interaction come before meaning, while the Cartesian view considers action as arising from meaning, as the expression of internal mental states. So, the way things are organized shapes our understanding of those things. A logical conclusion of these observations could be to construct the Knowledge Base of a user-adapted system by observing the real user-system interaction, since we cannot predict the user's behavior until she has experienced the system. Nevertheless, there are some ways to model the user in advance. The user model could be originated, for instance, by the observation of real users interacting with similar systems, or with the system to be modeled if it has already been implemented in a non-adaptive version. Therefore, interpretative techniques such as contextual inquiry, cooperative and participative evaluation, and ethnography can be applied to monitor the user and obtain feedback from her in every design step. To extract relevant user dimensions we could then analyze work practices and look for common patterns emerging from different users' actions. On the basis of the existing correlations between users and practices, information on how users understand the system can then be exploited to model the users. To offer personalized recommendations, instead, we could build the system's knowledge base by monitoring the users' choices. We could then propose the system's recommendations to the users, ask them to evaluate such proposals, and discuss their choices with them, finally revising the Knowledge Base on the basis of user feedback.
Following Dourish's advice [3], the designer should focus on the ways the user understands the tool and on how she uses it in each situation, instead of designing the ways to use the artifact. The consequence, in user-adapted systems, is making the user aware of how adaptation works and adapting the ways in which these facilities are presented on the basis of the user model, since different users understand the tool in different ways. Users with a background in Computer Science approach an adaptive system differently from, for instance, users with a background in Literature; so, in this case too, different suggestions tailored on the basis of the user profile are necessary. In conclusion, the key points of qualitative and embodied approaches are: i) the importance of observing the user in her real context (social, cultural, organizational); ii) gathering field data and studying working settings; iii) the importance of usage studies that point out unexpected uses of technology that the designers had never intended; iv) attention to user practices and to practices shared within communities; v) user involvement and user participation in the system design; vi) the link between meaning and experience.
3. Conclusion
To conclude this brief excursus, I advocate, of course, the importance of evaluation and testing in every design phase of a user-adapted system. Significant testing results can lead to more appropriate and successful systems. In my view, both quantitative (in terms of controlled experiments) and qualitative research methodologies can offer fruitful contributions. First of all, however, in both cases it is important to carry out the evaluation of the system correctly. In fact, the problem with most evaluations is the non-significance of their results and therefore the impossibility of generalization.
Then, the choice between quantitative and qualitative methodologies depends on the point of view of the evaluation: while quantitative research tries to explain the variance of the dependent variable generated through the manipulation of independent variables (variable-based), in qualitative research the object of study is the individual subject (case-based). Qualitative researchers argue that a subject cannot be reduced to a sum of variables, and that therefore a deeper knowledge of a smaller group of subjects is more fruitful than an empirical experiment with a representative sample. The goals of the analyses are also different: while quantitative researchers try to explain cause-effect relationships, qualitative researchers aim to comprehend the subjects under study by interpreting their points of view. While quantitative research tries to explain why subjects behave in a particular way, qualitative research tries to explain how subjects behave in that particular way. Concerning the methodologies of analysis, quantitative researchers try to validate a theory by falsification, while qualitative researchers try to identify the so-called ideal types (e.g., the behavioral patterns of different kinds of web users, useful for eliciting the most suitable interface) through the description and classification of the collected empirical data and the identification of typology dimensions. The ideal types are conceptual categories useful for interpreting the reality under observation. In the case of user modeling systems, these categories can offer cues for modeling the features of the users and then adapting the system to these features. Indeed, the choice between quantitative and qualitative methodologies is not trivial and depends on the aims and purpose of the evaluation. If we want to test the impact of different adaptation techniques in a given interface, a controlled experiment can produce useful results.
For instance, I tested the different impacts of adaptation techniques applied to a web site by means of a controlled experiment with two different conditions (users interacting with the adaptive version of the site vs. users interacting with the non-adaptive version); see [4] for details. Instead, to discover the features of an interaction useful for modeling the user, observing users in context and interviewing them can offer material for building the user model categories. For instance, in a project we are carrying out concerning the design of a web site offering technical information at different levels of detail (depending on the different backgrounds of the users), we are collaborating with domain experts and final users in an iterative design-test-redesign process to define the different levels of interaction for given users (e.g., expert, non-expert). To sum up, if we want to discover new categories with which to model an interaction, a qualitative approach can produce more fruitful results, while if we want to investigate the impact of already known variables, a quantitative approach may be preferred. However, this is not a strict rule, and both approaches can offer interesting results.
References
1. P. Brusilovsky, C. Karagiannidis, and D. Sampson, The benefits of layered evaluation of adaptive applications and services, Proceedings of the workshop at the Eighth International Conference on User Modeling, UM2001, 2001, 1-8.
2. D. Chin, Empirical evaluation of user models and user-adapted systems, User Modeling and User-Adapted Interaction 11, 2001, 181-194.
3. P. Dourish, Where The Action Is: The Foundations of Embodied Interaction, MIT Press, 2001.
4. C. Gena, Evaluation methodologies and user involvement in user modeling and adaptive systems, PhD Thesis, Università di Torino, 2003.
5. N. Good, J. B. Schafer, J. A. Konstan, A. Borchers, B. M. Sarwar, J. L. Herlocker, and J. Riedl, Combining collaborative filtering with personal agents for better recommendations, Proceedings of the Sixteenth National Conference on Artificial Intelligence, 1999, 439-446.
6. A. Kobsa, J. Koenemann, W. Pohl, Personalized Hypermedia Presentation Techniques for Improving Online Customer Relationships, The Knowledge Engineering Review 16(2), 2001, 111-155.
7. A. Paramythis, A. Totter, and C. Stephanidis, A Modular Approach to the Evaluation of Adaptive User Interfaces, Proceedings of the workshop held at the Eighth International Conference on User Modeling, Sonthofen, Germany, July 13th, 2001, 9-24.
8. J. Preece, Y. Rogers, and H. Sharp, Interaction Design: Beyond Human-Computer Interaction, New York, NY: Wiley, 2002.

Fluidtime: Time Information Interfaces for Public Services
Michael Kieslinger
Interaction Design Institute Ivrea
Abstract. Waiting occurs when our personal time schedules do not coincide with the schedules of the people and services with whom we interact. Because both people and services are in constant flux, precise appointment times are not the most useful means of coordination. When people are provided with continuously updated time information about a service or appointment, the activity of waiting becomes more tolerable. With accurate time information, people can adjust their behaviour accordingly and take control over how they wish to spend their time. The research project described herein investigates a new way of interacting with time, based on the changing availability of the people and services in our environment. The Fluidtime prototype system was created to provide people with personalised, accurate time-based information directly from the real-time databases of the services they are seeking. This abstract describes the case studies that have been implemented, presents first insights from the trials and discusses the design issues these trials raised.
1 Introduction
The examples show a trend towards the use of communication technology for coordinating the timing of services. Providing people with up-to-date information is becoming a more common service and is appreciated by customers [3]. The Fluidtime project aims to contribute to these developments by finding engaging, convenient and effective means to view and interact with real-time information. Thanks to advances in wireless Internet technology, it is possible to create ubiquitous access to real-time information. Current systems have the drawback that they are not accessible through easy-to-use interfaces or products that the customer can use while at home, in the office or on the move. For instance, travellers first need to go to the train station in order to find out if the train is delayed, or, in the case of SMS-based updating systems, any schedule changes are not reflected until the next SMS is sent. If, however, every fluctuation in the schedule were to produce a new SMS message, the recipients could easily be flooded with too many messages.
2 The Development of the Fluidtime Prototype
We developed a time information system and interface prototypes to investigate the opportunities and impact of using real-time information. The system works by tapping into already existing real-time logistical information from bus companies and laundry services and makes it available to the Fluidtime users via wired and wireless networks. The two contexts chosen for the prototype development were public transport in the city of Turin [1] and the laundry service at the Interaction Design Institute Ivrea. On average 20,000 people use the public transport facilities in Turin every day. The Turin transport authorities have already implemented a system that tracks all the buses and trams. The first service prototype makes these data visible to travellers at home, at work or on the move.
They can find dynamic information on mobile screen-based devices; while at home or at the office, people can get the same information on mechanical display units. The second service prototype is a scheduling and time monitoring system to help Interaction-Ivrea students schedule their use of shared laundry facilities. The 50 students and researchers share the use of a washing machine. Using different interface modalities, the service performs simple tasks, such as reminding users in the morning to bring their laundry to the Institute, or letting them know when their laundry slot is ready or their washing is done.
3 Interfaces
The interface design challenge was to create a simple and effective system of interactions. The intrinsic problem with time planning systems is that they require time to be used. On the one hand, they help us free up our time or organise our activities in a better way, but on the other, they require time to be operated, thus reducing their overall effectiveness. We developed two types of interfaces: one that is meant to be mobile and accessed anytime and anywhere, and a second that is stationary and designed to be used in the context of the home or office. It is worth mentioning that the second type, the physical object interfaces, were developed mainly as an exploration of the quality of interaction and information representation. We do not see them as proposals for products that should be built and go to market tomorrow; they explore some basic functionality and qualities. Conversely, using a generic mobile phone allowed us to explore interfaces that are on the market now and would not require special investment from customers.
3.1 Mobile interfaces
The interface is based on a Java software application that runs on a standard mobile phone, connects to a server to get the real-time information and then visualises the data.
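The core of such a client, once the real-time data has been fetched, is mapping an arrival estimate to something the user can act on. The sketch below is an illustration under assumed names and thresholds, not the actual Fluidtime code (which ran as a Java application on the phone):

```java
/**
 * Illustrative sketch only: map a real-time arrival estimate (in minutes)
 * to a coarse display state for the traveller. The class name, state names
 * and threshold values are invented for this example.
 */
public class FluidtimeClientSketch {
    enum DisplayState { RELAXED, GET_READY, HURRY }

    /** Translate minutes-to-arrival into the state shown to the user. */
    static DisplayState stateFor(int minutesToArrival) {
        if (minutesToArrival > 10) return DisplayState.RELAXED;
        if (minutesToArrival > 3)  return DisplayState.GET_READY;
        return DisplayState.HURRY;
    }

    public static void main(String[] args) {
        // In the prototype the estimate would come from the server's
        // real-time feed; here it is hard-coded for the example.
        int minutes = 2;
        System.out.println(stateFor(minutes)); // HURRY
    }
}
```

The design choice worth noting is the coarseness: instead of a raw countdown, the user sees one of a few states, which is what makes ambient renderings of the same information (icons, moving shoes, ribbons) possible on top of the same feed.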
We also created a wristband that allowed test users to wear the phone interface on the lower arm, just like a regular watch. Once the application was activated, it allowed them to check any changes in time information just by looking at the display, since the application was always on and always connected.
FI1 (Fluidtime interface 1): Perspective visualization: The interface shows how far away a certain pre-selected bus is from the chosen stop (see Figure 1). The application permanently updates the visualisation with data originating from the Turin transport authorities.
FI2: Iconic representation of time: An icon on the upper part of the screen indicates the pace the user should adopt in order to catch the next bus (see Figure 1). If the icon displays a tranquil character, the user can be relaxed. If the icon is a running figure, the user knows that the bus is due to arrive.
FI3: Overview of three routes and stops: With this interface, the user has increased planning control (see Figure 1). It allows the user to define up to three different routes at up to three different stops. This information is necessary for travellers who need to change buses. Test users also relied on it to decide which bus stop to walk to.
FI4: Washing status indicator: This interface shows the status of the washing machine, informing the user when it is the right time to go to the laundry room to unload the machine (see Figure 1).
Figure 1. Fluidtime interfaces 1, 2, 3 and 4 (FI1, FI2, FI3, FI4)
3.2 Physical object interfaces
FI6: Mechanical display unit with icons: This is an object for the transport context. It has the dimensions of a small hi-fi stereo (see Figure 2) and is meant to be placed in the home or office. Through the glass fronts of the object, the users can see small iconic representations of the buses that move from the background to the right-hand corner in the front. The position of the miniature bus tells the users how far the bus is from the bus stop.
FI7: Mechanical display unit with shoes: This object looks just like a small shoe rack of the kind people keep in their homes (see Figure 2). When the user activates the object, the movement of the shoes indicates the distance of the approaching bus. If the shoes move slowly, the bus is still far away, and the user could walk slowly and still catch it. If the pairs of shoes start to "run", then the user would also need to run in order to catch the bus. Since the movement of the shoes creates an acoustic pattern, the user can listen to the information even when not in the same room as the object.
FI8: Mechanical display with ribbons: This wall-mounted object has a discreet appearance and indicates the status of the washing machine (see Figure 2). The turning angle of the central cube indicates the progress of the washing machine. Once the washing cycle has finished, ribbons appear and clearly indicate that it is time to pick up the laundry. As soon as the washing machine door is opened, the object returns to its initial state.
Figure 2. Fluidtime interfaces 6, 7 and 8 (FI6, FI7, FI8)
4 The Trials
The interfaces for mobile phones (FI1 - FI3) described above were tested in Turin between May and June 2003. Four candidates in their early thirties were given a mobile phone with the applications installed, and two of them received a wristband for optional use. A fifth user received the interface FI7 to test. Since the applications were quite simple, the users required little training. All candidates were commuters who used buses every day to travel to and from work. The candidates were interviewed three times during the trial period. These interviews aimed to capture how they used the prototypes, the functional value of the interfaces, their usability and aesthetic quality, and the emotional and social attitudes of the test candidates. The interfaces were consulted on a daily basis. Users interacted with them either on their way to the bus stop or once they arrived.
Only in cases where the apartment or office was close to the bus stop did they start the application before leaving home or office. If people could estimate the time it would take them to reach the stop (e.g. when using an elevator), they started the application only once out on the street. Over time, users gained experience in estimating their timings. One of the users adjusted her "leaving the office" routine over time: she would start the application and, if the bus was still distant, surf the web or chat with colleagues until the bus was due to arrive. The application used by the trial candidates did not allow the storage of frequently used bus routes and stops. This turned out to be the biggest handicap for adoption. As mentioned in the introduction to the interfaces section, if time applications take too much time to operate, the value gained from the time information does not equal the effort needed to access it. A second usability handicap was the fact that the application could only be started from within the application menu of the mobile phone. Again, this effort is too great for the application to be useful on a frequent or daily basis. On a social and psychological level, the team found that real-time services not only support those who like to plan ahead and want to compare different route possibilities in order to save time or be more efficient, but also give people less inclined to plan more opportunities to seize the moment. This supported our hypothesis that time information devices do not necessarily save time; it all depends on how the person uses them. In the case of Fluidtime, the aim is to give people more control over time; it is the user's choice how to deal with the information.
5 Conclusions
The aim of the project was to develop a prototype infrastructure and a set of interfaces that allowed users to access real-time information in the context of everyday life (commuting and doing the laundry).
It was important to provide fully functional prototypes to the test users in their particular everyday-life contexts in order to study the direct influence of this new technology on their daily habits and routines. In ubiquitous computing environments, the flow of everyday interaction has to be as smooth as possible. The value gained from new applications is often not equal to the effort put into learning and using them. Mobile interfaces are difficult to test in the context of everyday life, since many factors can influence the research results. Nevertheless, we found it particularly helpful to spend time with the users while they used the system on the streets in their everyday environment.
References
1. 5T consortium Turin: http://www.5t-torino.it/index_en.html
2. Barth, J. Fluidtime survey: competitive research examples, http://www.fluidtime.net/download/Fluidtime_survey.pdf, 2002
3. Brardinoni, M. Telematic Technologies for traffic and transport in Turin, www.matisse2000.org/Esempi.nsf/0/45bb8686703c9e3ac12566a40036100e/$FILE/5T.doc
4. Kreitzman, L. The 24 hour society. Profile Books, London, 1999

UPoet: Poetic Interaction with a 3D Agent
Lamarca M., Zambetta F., De Felice F., Abbattista F.
Dipartimento di Informatica, Università degli Studi di Bari
[email protected], [email protected], [email protected], [email protected]
Abstract. In this paper we introduce UPoet, also known as the Ubiquitous Poet, the prototype of a 3D virtual character resembling a hieratic zen monk, which engages in poetic interaction with the user. UPoet's facial animations are used to express its reactions to user requests and to convey its feelings while the text of the poetry is being generated.
1 Introduction
Creativity is probably the most noticeable feature of our mind. Artificial creative systems could be effective in domains where conventional problem solving is unlikely to produce optimal solutions.
This is a multidisciplinary area which requires contributions from a wide range of other fields. We do not know of any up-to-date system sharing the same goals and the same set of features as UPoet, but a great deal of research has been done both in the field of 3D intelligent agents and in that of text generation. REA is a very complete ECA (Embodied Conversational Agent) [1], a Real Estate Agent able to converse with users and sell them a house according to their wishes and needs. The EMBASSI system [2] implements an intelligent shop assistant to facilitate user purchasing and information retrieval. Among literary systems we find Poevolve [3], from "poetry" and "evolve", an attempt to generate limericks using evolutionary algorithms. Another system is McGonagall, developed by Manurung [4], in which poetry generation is treated as an evolutionary exploration of all possible texts. Unfortunately, these last systems lack examples. Section 2 gives an overview of the system, Section 3 details the client side of the UPoet application and Section 4 explains the server side. Section 5 shows our experimental results, while future work is outlined in Section 6.
2 The UPoet System Architecture
Our UPoet application integrates the YODA generative module with a part of the S.A.M.I.R. (Scenographic Agents Mimic Intelligent Reasoning) [5] system: a 3D intelligent character acts as an animated front-end to the haiku (a Japanese poetic composition [6]) generated by the YODA module.
Fig. 1. The UPoet Architecture
A general picture of the application is given in Figure 1: it is split between a client, a Java applet enclosing the animated 3D face and an interface to let users chat with the system (the DMS client), and a small HTTP layer to ensure synchronized communication with the server. The user can chat with UPoet, receiving visual (i.e., animation of the 3D face) and textual responses, but can also request the generation of a haiku.
This request is passed to the server, which tries, using the YODA module, to generate a context-dependent haiku. A slightly modified version of the Alice chatterbot [7] is used to give responses to users when not in text-generation mode. This module represents the core of the server-side part of the Dialogue Management System.
3 The UPoet Client
The UPoet client is composed of two modules: FACE (Facial Animation Compact Engine) and the DMS (Dialogue Management System), both hosted in a common Java applet, which is responsible for storing the client state, sending user requests to the server and synchronizing the haiku verses received from the server with facial animations (one expression for each verse). The FACE animation system is devoted to generating 3D facial animations on the web [8]. It is implemented in Java and uses the Shout3D API [9].
4 The UPoet Server
The DMS (Dialogue Management System) is responsible for the management of user dialogues and for the extraction of the information necessary for giving textual responses to any user [8]. Its main task is retrieving specific information, based on the responses elaborated on the server side by the ALICE Server Engine [7], a Java Servlet enclosing all the knowledge and the core system services needed to process user input. The central module of the server side of UPoet is YODA, a prototype of a creative and interactive literary system [10]. Currently, YODA is able to generate haiku, a short Japanese poetic form which evokes images of natural events pertaining to a specific season. Usually, in English, a haiku is made up of 17 syllables distributed over 3 verses. YODA is interactive in the sense that the user gives suggestions from which the haiku is generated. Suggestions are given in terms of words, related or not to seasons.
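The 17 syllables of the classical form are conventionally distributed 5-7-5 over the three verses (this distribution is standard haiku convention, not stated in the paper), and a generator can check its output against that pattern. The class below is an invented illustration, not part of YODA; per-verse syllable counts are supplied by the caller, since robust syllable counting is itself a non-trivial task:

```java
import java.util.Arrays;

/**
 * Illustrative sketch (invented, not part of YODA): validate that the three
 * verses of a generated haiku follow the conventional 5-7-5 syllable pattern.
 */
public class HaikuStructure {
    private static final int[] PATTERN = {5, 7, 5};

    /** True if the supplied per-verse syllable counts match 5-7-5. */
    static boolean fitsPattern(int[] syllablesPerVerse) {
        return Arrays.equals(syllablesPerVerse, PATTERN);
    }

    public static void main(String[] args) {
        System.out.println(fitsPattern(new int[]{5, 7, 5})); // true
        System.out.println(fitsPattern(new int[]{5, 7, 6})); // false
    }
}
```

A surface realizer such as YODA's could use a check of this kind as a final gate: regenerate or adjust a verse whenever the counts fall outside the target structure.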
YODA is creative in the sense that it elaborates these suggestions "creatively", in order to use in the haiku a set of words semantically related to the suggestions, making the user feel that YODA has taken his suggestions into account. YODA consists of several components. We use a lexical database of the English language, WordNet [11], whose great advantage over other lexical databases and dictionaries is the possibility of traversing large semantic networks. The Text Planner handles the semantics: YODA chooses the words to use following the user's suggestions, and the heuristics guiding the text planning process exploit the structure of WordNet. The Surface Realizer takes care of the morphological and syntactical aspects of text generation; it is able to enforce agreement during the generation of the text, using the words chosen by the Text Planner component. To summarize, the process of haiku generation involves the following steps:
– the user inserts one or more words, taken by YODA as suggestions;
– for each term in this set, the suitable part(s) of speech are determined;
– from this set, the terms effectively used for the generation are obtained via WordNet: rather than using the terms inserted by the user, YODA selects (for example) synonyms; if a term is related to a particular season, then selection is guided by that season;
– if the number of words found is not yet sufficient for generation, some other words are chosen randomly;
– finally, the surface realizer generates sentences that take the user's suggestions into account, while fitting the haiku structure.
4.1 Haiku and Facial Expressions
A haiku is a way to share feelings: UPoet shares emotions with the user by means of haiku and facial expressions. The 3D agent changes its facial expression according to the content of the haiku generated by YODA. For the realization of this feature, we follow the theory of Ekman and Friesen [12] on facial expression.
This theory claims that every facial expression is given by a combination of six fundamental expressions, corresponding to six emotions: anger, disgust, fear, happiness, sadness, and surprise.
Fig. 2. A view of two conversations with UPoet, culminating in haiku generation
UPoet manages the mapping between text and expression by coupling our 3D facial animation engine (FACE) and our natural language generation system (YODA), synchronizing text and expression on the client side of the application. The mechanism used for the mapping from words to expressions is based on a metric distance defined in terms of semantic relationships in WordNet. Distances are calculated between the words used in the haiku and the words corresponding to the six emotions. Then, a corresponding facial expression is generated for every verse of the haiku.
5 Experimental Results
Some typical interactions are depicted in Figure 2. We also report a small piece of conversation.
UPoet> I'm the Ubiquitous Poet. I compose haiku inspired by your suggestions and tastes.
user> What is a haiku?
UPoet> Haiku is a short Japanese poem. It originated in Japan during 17th century as an evolution of the first initial verses of a collective poem called renga. The main subject of a haiku is nature. Some classical themes are astronomy, climate, animals, plants, geography, life, and observances.
user> I think rage is wrong.
UPoet> Please tell me the correct answer. Aggrieved injustice. Violated violation acted false injustices.
6 Future Work
Our current major concerns are about the client side of UPoet. We would like to:
– enhance the YODA text generation mechanics using evolutionary learning techniques, for example genetic algorithms or genetic programming;
– improve the variability of the syntax, adding more tenses and introducing the use of phrasal verbs;
– generate kinds of poetic composition other than haiku (i.e., Western forms but also modern styles).
– integrate a TTS (Text-To-Speech) system into our client, so that the Ubiquitous Poet could perform its compositions dramatically, in a more human-like sense of the term. This should result in a far more engaging application, targeting a wider audience across different cultures and social backgrounds.
References
1. Cassell, J., Sullivan, J., Prevost, S., Churchill, E., eds.: Embodied Conversational Agents. MIT Press, Cambridge (2000)
2. Jalali-Sohi, M., Baskaya, F.: A multimodal shopping assistant for home e-commerce. In Rosen, R., Snell, F.M., eds.: Proceedings of the 14th Int'l FLAIRS Conf., Key West (2001) 2–6
3. Levy, R.P.: A computational model of poetic creativity with neural network as measure of adaptive fitness. In: Proceedings of the Workshop Program at the Fourth International Conference on Case-Based Reasoning 2001. (2001)
4. Manurung, H.M.: McGonagall: The evolutionary poet. 1st prize, Poster competition, the Informatics Jamboree, 23–25 May 2001 (2001)
5. Zambetta, F., C.G.: Designing not-so-dull virtual dolls. In: Soft Computing Systems: Design, Management and Applications. Volume 87 of Frontiers in Artificial Intelligence and Applications, Amsterdam, IOS Press (2002) 513–518
6. Higginson, W.J.: The Haiku Handbook. Kodansha International, Tokyo (1989)
7. ALICE AI Foundation: http://www.alicebot.org
8. Zambetta, F., C.G., F., A.: Cindy: a 3D virtual entertainer. AISB Journal (2003) 291–302
9. The Shout3D API: http://www.eyematic.com/products shout3d.html
10. Lamarca, M.: YODA: un prototipo di sistema creativo letterario. Graduation thesis, Università degli Studi di Bari, Dipartimento di Informatica (2003)
11. The WordNet Website: http://www.cogsci.princeton.edu/~wn
12. Ekman, P., Friesen, W.V.: Facial Action Coding System. Consulting Psychologists Press, Palo Alto, CA (1978)

[The following pages contained the paper "The Elettra Virtual Collaboratory: a CSCW System for Scientific Experiments with Ultra-Bright Light Sources" by Roberto Ranon, Luca Chittaro and Roberto Pugliese; its text could not be recovered from the source.]
Visual Arts Web Spaces Perception and Exploration Styles
Paolo Raviolo
Dip. di Scienze Umane e dell'Educazione – Fac. di Lettere e Filosofia II – Università di Siena – Viale Cittadini, loc. Il Pionta, 52100, Arezzo (IT)
[email protected] http://www.unisi.it/IRF/
Abstract. In this work an ethnographic approach was used to analyze the expectations and perception styles of a sample of users with respect to web spaces devoted to the visual arts, in particular regarding the perception and use of textual vs. visual communication. A web questionnaire, available in Italian and English, was used, organized into 28 questions grouped under six themes: demographic data, experience in web use, competence in the artistic field, web navigation style, conception of cyberspace, and the opposition between virtual and real visual art. The questionnaire was administered to about 250 subjects between October 2001 and February 2002. The questionnaire address was announced through a message to some international discussion lists; the same message was also sent to a selection of e-mail addresses chosen from among the participants of two international conferences: Electronic Imaging & the Visual Arts – EVA 2001, held in Florence, and the International Cultural Heritage Informatics Meeting 2001, ICHIM01, held in Milan.
By analyzing the questionnaire results it was possible to draw some profiles of use of the web as a medium for the visual arts: the textual dimension of information seems to remain relevant for all users, and information seeking often takes place in a predominantly textual way; interest in the visual dimension emerges in the preliminary phase of navigation among the pages of a site, when no textual references are available to orient the search, and in the final phase, when the object of interest is not textual.
1.
Introduction
The aim of this work is to develop a methodology for analysing qualitative information about the perception of the web as a virtual space and about user expectations towards the visual arts presented through this medium. The analysis unfolds along two dimensions: users' experience with the web and their competence in the arts, in particular the visual arts [1]. In order to build a frame of reference that accounts for this complexity, we adopted an ethnographic approach, following the methodology proposed by James P. Spradley [2], to design an on-line questionnaire to be administered to subjects with a presumably medium-to-high interest in the visual arts and/or in web technology. The results were first analysed quantitatively, observing the distribution of the sample for each individual question. The second step consisted in cross-referencing the answers given by the same subject to different questions, in order to generalise some user profiles according to the user-modelling principles proposed by Brusilovsky [3] and thus to produce abstractions about the behaviour of the sample under examination.
2. Questionnaire
The questionnaire consists of 28 questions grouped into six topics: demographic data, experience in using the web, competence in the arts, web navigation style, conception of cyberspace with reference to a taxonomy based on semantic fields [4] according to the classification proposed by Chris Hutchison [5], and the opposition between virtual and real visual art. The questionnaire was administered to about 250 subjects between October 2001 and February 2002; all data were collected through the questionnaire's web page. It was published in Italian and English, and users could switch language before filling it in.
The questionnaire address was posted to a number of international discussion lists; the same message was also sent to a selection of e-mail addresses chosen from among the participants in two international conferences: Electronic Imaging & the Visual Arts (EVA 2001), held in Florence, and the International Cultural Heritage Informatics Meeting 2001 (ICHIM01), held in Milan.
3. Results
The age of the subjects who completed the questionnaire ranges from 20 to 71; 50% of the subjects are between 27 and 31, and the mean age is 34. 52% of the sample are female and 48% male; 67% used the Italian-language questionnaire and the remaining 33% the English one. Nineteen different nationalities were declared, the most represented being Italian (63.4%), American (18.6%) and British (4%).
3.1. Education
As regards education, 2.5% of the respondents declared a vocational qualification (technical qualification or G.C.S.E.), 20.5% a secondary school diploma (A level or high school certificate), 13% a degree in a scientific field, 28% a degree in the humanities, and 31% a doctorate or some other post-graduate qualification.
3.2. Occupation and experience in the arts
Overall, 28% of the subjects declare that they work in a field related to computing, 13% in education and about 18% in pure or applied research, while 9% declare themselves students. 62% of the sample state that they are involved in artistic activities, in particular: multimedia 28%, visual arts 27%, graphics 21%, music 10%, design 5%, dance 1%.
An empirical method such as the questionnaire was chosen to obtain further qualitative indications about users' expectations and perceptions of web spaces devoted to the arts. The questionnaire was administered to subjects expert in the web and/or in the visual arts. According to the answers to question 8, more than 97% of the users declare that they habitually use the web and e-mail, and more than 48% that they use newsgroups; the fact that 45.3% of the users declare that they use FTP (File Transfer Protocol), and 18.2% the Telnet service, indicates that many of them also have skills related to building and maintaining web sites. That a high proportion of the users are experts is confirmed by the fact that only 31.3% declare that they have never created a web page; 30% declare that this activity is part of their job, while overall almost 35% state that they create web pages occasionally or often. The sample is also significant in terms of artistic competence: 62.7% of the users state that they are involved in artistic activities. Among them, 76.8% overall are involved in visual art forms (visual arts, multimedia, graphics), while 4.9% work in design. The interest in the visual arts is confirmed by the declared behaviour concerning visits to museums and to art web sites: about 50% of the users state that they have visited between 5 and 20 museums or art exhibitions, and as many art web sites, in the last six months.
3.3.
Navigation style
Analysing the navigation style of the sample, an attitude oriented towards searching for and locating information emerges: written text is the preferred form for 48.1% of the users, images are the preferred type of information for 32.9%, while video, sound and synthetic environments are considered secondary, a phenomenon to be related to the limited diffusion of high-speed connections. Textual information is also favoured in the search phase: almost 70% of the users declare that they mainly use search engines rather than directories. Search engines, in fact, operate exclusively on textual information, whereas directories organise links into categories within which the search is still textual but no longer built around a single term. Another indicator of user behaviour towards online content is the attitude towards plug-ins, small pieces of software installed on the personal computer to display multimedia elements such as three-dimensional environments and the like. Since downloading and installing software on one's computer takes time and exposes the user to potential malfunctions, this operation was taken as an indicator of interest in a high degree of interactivity. More than 70% of the users state that they install a plug-in "only if very interested in the content of the site", and only 7.8% state that they always install plug-ins when the site being visited requires them. Since plug-ins are useful almost exclusively for displaying visual material, generally other than static images, we can infer that users are willing to download this software when they find interesting content on a site that is neither text nor a static image, hence in roughly 12% of cases.
If, in terms of content, users mainly find text and static images interesting, when we analyse texts and images used as navigation tools, with the function of hypertext links, the role users attribute to signs becomes clearer. Faced with the question "What kind of link do you feel most comfortable with on a web site?", choosing between textual links and links represented by images, more than half of the users state that their preference depends on the kind of link. If we investigate the reasons for preferring one kind of link over another, a marked differentiation emerges in the function attributed to text and image. The value of the image is essentially linked to immediacy: it is considered quick to understand by 43% of the users and easy to spot by 30.2%; text, on the other hand, is preferred because it is analytical: 61.5% of the users choose a textual link because it contains exactly what they are looking for, and 16.7% because it contains details about the following pages. The image thus guides the user's search through iconic elements during the visual exploration of the web page, while text is considered useful when the user has a goal that can be expressed in textual form, through one or more keywords. While the kind of link concerns navigation between web pages, at a higher level it is possible to analyse users' expectations about the organisation of content within the site. In the context of a visit to a real art exhibition, about 36% of the users state that they follow a personal route different from the one suggested by the arrangement of the works in the exhibition space, and 72.8% of the users state that in a museum web site the works should be presented with an organisation different from the one existing inside the museum.
The main advantages of a different organisation of the works belonging to the museum lie, according to 55.6% of the users, in the possibility of providing more contextual information and presenting the work from different points of view (historical references, production techniques, materials, original location), while 36.4% of the users point to the opportunity of offering the visitor a kind of information usually not available when visiting a real museum, such as the enlargement of a detail or the back of a canvas. Is it possible to visit a collection of artworks exclusively through the web? On this question the users split almost evenly between those in favour and those against: 46% answer affirmatively, while 52.8% consider it impossible; deepening the analysis with statistical methods, as we will see below, makes it possible to build a profile of the users who share one position or the other. Among the users in favour of an exclusively virtual visit to collections, only 11.4% believe it can substitute for the exhibition of works that are not visible to the public, and 12.5% credit the virtual dimension with the absence of opening-hours constraints. The majority of users believe that the main advantages of the virtual exhibition of works lie in the possibility of reconstructing a historical and critical context of reference (20.5%) and in the personalisation of information made possible by a virtual environment, an advantage indicated by 30.7% of the users. More than 50% of the users who consider it possible to visit an art collection virtually attribute this choice to the intrinsic characteristics of the medium, such as interactivity and interface personalisation, rather than to the possibility of overcoming physical constraints such as distance, visiting hours or the non-exhibition of works to the public.
4.
Searching for relations
The first analysis of the data provided some quantitative indications about the results. Having highlighted the users' tendencies for each question, it is possible to deepen the analysis by establishing relations between variables, i.e. by relating the answers given by the same users to different questions. Identifying these relations makes it possible to build user profiles with respect to their expectations about the representation of the visual arts on the web. A necessary premise is that the results have limits intrinsic to the statistical method and to the number of cases examined, and show an over-representation of Italian and Anglo-Saxon culture compared with other cultural areas. It is nevertheless possible to draw from the collected data some indications, partly intuitive, which can contribute to a context useful for the study of visual arts communication and of the relation between the visual and the textual dimension in the use of the web medium. The examination of the relations was carried out through bivariate analysis, which consists in considering two variables simultaneously for each unit studied; here the units are the users and the variables are their answers to the questions in the questionnaire. The analysis was conducted by transforming the absolute values into percentages in a two-way table [6]. Two-way tables are a tool for detecting the relation between two variables, and studying such a relation means looking for an internal order in the data. The method for identifying a relation between two variables consists in ruling out independence; the strength of the relation then varies. Within the table, a relation can be detected by comparing the percentage values, broken down according to the variable under analysis, with the aggregate percentage in the corresponding row or column total.
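The construction of such a two-way percentage table can be sketched in a few lines. This is a minimal illustration with invented answer pairs; the paper's actual analysis was done in SPSS, and the function name `two_way_table` is an assumption of this sketch:

```python
from collections import Counter

def two_way_table(pairs):
    """Cross-tabulate two categorical variables as percentages of the total,
    adding row and column marginals under the key 'Total'."""
    counts = Counter(pairs)
    n = len(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    table = {r: {c: 100.0 * counts[r, c] / n for c in cols} for r in rows}
    for r in rows:  # row marginals
        table[r]["Total"] = sum(table[r][c] for c in cols)
    table["Total"] = {c: sum(table[r][c] for r in rows) for c in cols}
    table["Total"]["Total"] = sum(table["Total"][c] for c in cols)
    return table

# Invented example: (preferred link type, plug-in habit) for ten respondents.
answers = ([("text", "if interested")] * 5 + [("image", "always")] * 3
           + [("text", "never")] * 2)
t = two_way_table(answers)
# A relation is suggested when a cell percentage diverges from the
# corresponding marginal, e.g. t["text"]["never"] vs t["Total"]["never"].
```

Reading the table then proceeds exactly as described above: each cell percentage is compared against the aggregate percentage in its row or column total.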
Measuring the strength of a relation means analysing the positions of the cases within the cells: a relation is stronger the more the cases cluster in some cells, or the more the frequencies of some cells stand in a certain relation to those of every other. Correlation indices are used, generally ranging from -1 to 1 according to the intensity and sign of the relation, where 0 indicates perfect independence and 1 perfect dependence. In this work we adopt a relation index called phi-squared (φ²), which has the advantage of being independent of the number of cases examined and therefore makes the results of different tables directly comparable. Phi-squared theoretically varies from 0, in the case of no relation, to 2, in the case of a perfect relation. With real data these two extremes never occur; a relation of maximum intensity can be identified at about 0.85 [7]. Considering the number of cases examined in this work, it was considered reasonable to hypothesise a relation when the phi-squared of a table exceeds 0.3. The bivariate analysis and the computation of the strength of the relation through phi-squared were carried out with the SPSS statistical package [8].

How often do you download a plug-in when it is required to view a web page?
                            A        B               C                         D       E
                            Always   In more than    Only if very interested   Never   Total
                                     80% of cases    in the site's content
Written menu         1      1.9%     5.2%            22.1%                     0.6%    29.9%
Graphic repr.        2      2.6%     3.9%             9.7%                     0.6%    16.9%
Depends on link      3      2.6%     7.8%            38.3%                     1.9%    50.6%
Don't know           4      0.6%     —                0.6%                     1.3%     2.6%
Total                5      7.8%    16.9%            70.8%                     4.5%   100.0%

Tab.1 – Bivariate analysis of the answers to questions 14, "What kind of link do you feel most comfortable with on a web site?", and 15, "How often do you download a plug-in when it is required to view a web page?" (phi-squared = 0.416).

An example of the result of this analysis can be seen by crossing the data of question 14, about the kind of link users prefer to find on the web, with those of question 11, about how often users say they download plug-ins. One might expect users more attracted by image links to be more inclined to download this software, which is often indispensable for viewing animated images and three-dimensional environments and is widely used by sites whose main form of communication is the image. Indeed, the percentage of users who declare that they download plug-ins often, or in more than 80% of cases, is higher among the users who consider images the most interesting form of link (Tab.1, A2, B2). To see this, the percentage in the cell under examination, for instance A2, must be compared with the total percentage of its row, E2: among the users who declared a preference for text on the web, 29.9% (Tab.1, E1), those who declare that they always download plug-ins are only 1.9% (Tab.1, A1); among the users who declare more interest in images, 16.9% overall (Tab.1, E2), those who declare that they always download plug-ins are 2.6% (Tab.1, A2).
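The phi-squared index used above can be computed from a contingency table of raw counts as χ²/N. The following is a minimal sketch under that assumption; the paper's values were produced with SPSS [8], so the function below is illustrative rather than its actual procedure:

```python
def phi_squared(table):
    """Phi-squared (chi-squared divided by N) for a contingency table of
    raw counts. 0 means independence; this paper treats values above 0.3
    as suggesting a relation between the two variables."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under the independence hypothesis.
            expected = row_tot[i] * col_tot[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2 / n

# Invented counts: an independent 2x2 table gives 0,
# a perfectly associated 2x2 table gives 1.
independent = [[10, 10], [10, 10]]
associated = [[20, 0], [0, 20]]
```

In a 2×2 table φ² is at most 1; for larger tables it can exceed 1, which is why a theoretical range up to 2 is mentioned above.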
The most interesting datum in the table examined is certainly the fact that a very high percentage of the users, 38.3% (Tab.1, C3), state that their preferred form of link depends on the context and tie the download of a plug-in to a high degree of interest in the site. This behaviour towards plug-ins is prevalent among all users, as was already evident from the data of the plug-in question alone, but by crossing the data we obtain an additional piece of information: almost 40% of the users state that they prefer a non-invasive form (without plug-ins) and consider the kind of link strongly contextual. To understand what it means that the kind of link depends on the context, we can cross the results of question 14, about the preferred kind of link, with questions 15 and 16, about the reasons that lead users to activate a textual link rather than one represented by an image.

What usually leads you to click on an image?
                            A              B              C          D            E
                            Easy to spot   Quick to       Familiar   Don't know   Total
                                           understand
Written menu         1      11.9%          10.1%          2.5%       4.4%         28.9%
Graphic repr.        2       6.9%           6.3%          2.5%       1.3%         17.0%
Depends on link      3      10.7%          27.0%          4.4%       9.4%         51.6%
Don't know           4       0.6%           —             0.6%       1.3%          2.5%
Total                5      30.1%          43.4%         10.1%      16.4%        100.0%

Tab.2 – Bivariate analysis of the answers to questions 14, "What kind of link do you feel most comfortable with on a web site?", and 15, "What usually leads you to click on an image?" (phi-squared = 0.315).
What usually leads you to click on a textual link?
                            A             B                C                 D            E
                            Contains a    Contains         Contains          Don't know   Total
                            known         details about    exactly what
                            keyword       the following    I am looking
                                          pages            for
Written menu         1       3.1%         4.2%             17.7%             2.1%         27.1%
Graphic repr.        2       2.1%         2.1%              9.4%             —            13.5%
Depends on link      3      11.5%         6.3%             33.3%             5.2%         56.3%
Don't know           4       —            —                 1.0%             2.1%          3.1%
Total                5      16.7%        12.5%             61.5%             9.4%        100.0%

Tab.3 – Bivariate analysis of the answers to questions 14, "What kind of link do you feel most comfortable with on a web site?", and 16, "What usually leads you to click on a textual link?" (phi-squared = 0.389).

Through Tab.2 we can observe that among the 51% of users for whom the preferred form of link depends on its type (Tab.2, E3), 27%, more than half, state that they choose an image because it is quick to understand (Tab.2, B3), and 10.7% because it is easy to spot (Tab.2, A3). On the corresponding question, 33.3% of the users state that their preference goes to textual links when they contain exactly what they are looking for (Tab.3, C3), while 11.5% state that their attention falls on textual links when they contain details about the following pages (Tab.3, A3). On the basis of the analysis conducted so far, we can therefore state that more than 50% of the users do not express a clear preference for one kind of link, but favour visual language for its immediacy while considering textual links more analytical and able to offer greater informational detail.
Do you think it is possible to visit an art collection exclusively on the web?
                            A            B        C        D
                            No answer    No       Yes      Total
No answer            1      1.2%         —        1.2%      2.5%
No                   2      —            19.9%    14.9%    34.8%
Yes                  3      —            26.1%    36.6%    62.7%
Total                4      1.2%         46.0%    52.8%   100.0%

Tab.4 – Bivariate analysis of the answers to questions 17, "Are you currently involved, in any capacity, in artistic activities?", and 25, "Do you think it is possible to visit an art collection exclusively on the web?" (phi-squared = 0.722).

Crossing the data of question 17, "Are you currently involved, in any capacity, in artistic activities?", with those of question 25, "Do you think it is possible to visit an art collection exclusively on the web?", a rather strong relation can be observed (Tab.4, phi-squared = 0.722) between involvement in artistic activities and considering it possible to enjoy artworks exclusively in virtual form through a computer screen. Looking at the table explains such a marked relation: 19.9% of the users who declare that they are not involved in artistic activities also state that they do not consider a visit to an art collection through the web possible (Tab.4, B2), against 14.9% who consider it possible (Tab.4, C2). Among the users who declare themselves involved in artistic activities, by contrast, those in favour of the possibility of enjoying art exclusively through the web prevail: 36.6% of the users (Tab.4, C3) against 26.1% choosing the opposite option (Tab.4, B3). Crossing the data thus yields an important piece of information: users with experience in the arts tend to view the web more favourably as a medium for the visual arts than users who declare no direct experience as producers of artworks.
What kind of link do you feel most comfortable with on a web site?
                            A          B          C            D            E
                            Written    Graphic    Depends on   Don't know   Total
                            menu       repr.      link
Written text         1      17.4%      6.6%       23.4%        0.6%         47.9%
Images               2       7.8%      7.8%       16.8%        0.6%         32.9%
Video                3       0.6%      1.2%        2.4%        —             4.2%
Audio                4       —         —           0.6%        —             0.6%
3D environments      5       1.8%      1.8%        1.8%        —             5.4%
Don't know           6       0.6%      0.6%        7.2%        0.6%          9.0%
Total                7      28.1%     18.0%       52.1%        1.8%        100.0%

Tab.5 – Bivariate analysis of the answers to questions 13, "What kind of information do you prefer to find on a web site?", and 14, "What kind of link do you feel most comfortable with on a web site?" (phi-squared = 0.311).

Comparing the form of link users say they favour with the kind of content they find most interesting, it can be observed that among the users who prefer textual content on the web, those who prefer a textual link are in the majority, about 17.4% (Tab.5, A1), compared with those who prefer a link consisting of an image, about 6.6% (Tab.5, B1). Among the users who find content made of images more interesting, the same percentage of users, 7.8%, is split between the preference for the textual form and for the image form of links (Tab.5, A2, B2). In Tab.5 the percentages of the users for whom the form of link depends on the context are numerically higher (Tab.5, C1, C2), since this was the prevalent answer (Tab.5, C7), but here the interesting datum is the distribution of the answers in the first four cells of the table (Tab.5, A1, B1, A2, B2), on the basis of which we can state that the textual form is preferred while navigating among web pages even by most of the users who say they are more interested in non-textual content within web sites. Only 7.6% of the users who answered the questionnaire state that they prefer images both as content and as a navigation tool.
What kind of information do you prefer to find on a web site?
                            A          B        C       D       E         F            G
                            Written    Images   Video   Audio   3D env.   Don't know   Total
                            text
No answer            1       1.2%      0.6%     —       —       —         —             1.8%
No                   2      21.6%      9.6%     0.6%    —       0.6%      2.4%         34.7%
Yes                  3      25.1%     22.8%     3.6%    0.6%    4.8%      6.6%         63.5%
Total                4      47.9%     32.9%     4.2%    0.6%    5.4%      9.0%        100.0%

Tab.6 – Bivariate analysis of the answers to questions 17, "Are you currently involved, in any capacity, in artistic activities?", and 13, "What kind of information do you prefer to find on a web site?" (phi-squared = 0.249).

Tab.7 – Bivariate analysis of the answers to questions 13, "What kind of information do you prefer to find on a web site?", and 17-18, "If you are currently involved, in any capacity, in artistic activities, please indicate the field:" (phi-squared = 0.498). Most cell values are illegible in the source; the recoverable column totals are: dance 0.9%, visual arts 25.9%, multimedia 27.8%, graphics 22.2%, music 10.2%, design 4.6%, other 8.3%.

As was to be expected, the users who declare involvement in artistic activities tend more often to declare an interest in images as a form of communication on the web (Tab.6, A2, A3, B2, B3), although many users still find textual information more interesting. Indeed, the relation coefficient between involvement in artistic activities and the kind of information considered most interesting is low (Tab.6, phi-squared = 0.249). In the case of the users who say they work in multimedia (Tab.7, C7), 13.9% consider the textual form the most interesting (Tab.7, C1), while only 7.4% declare themselves more attracted by images (Tab.7, C2). Here too, the analysis of the data confirms the strength of the textual form over the non-textual one even among the subjects who declare that they deal directly with art.

Choose the group of words that best represents your concept of cyberspace
                            A          B        C            D               E             F
                            Textual    Fluid    Electronic   Solid space     Solid space   Total
                            space      space    space        focused on      focused on
                                                             connections     nodes
Written text         1      19.3%      4.3%      6.8%        0.6%            16.1%         47.2%
Images               2       9.9%      4.3%      4.3%        1.9%            13.0%         33.5%
Total                7      32.9%      9.3%     14.3%        3.7%            39.8%        100.0%

Tab.8 – Bivariate analysis of the answers to questions 13, "What kind of information do you prefer to find on a web site?", and 19, "Choose the group of words that best represents your concept of cyberspace" (phi-squared = 0.365). For the semantic fields referred to by the definitions of cyberspace, see Hutchison 1996 [5]. The video, audio, 3D environments and "don't know" rows are illegible in the source.

Analysing the relation between forms of communication and the concept of cyberspace, identified through the semantic fields proposed by Chris Hutchison (Hutchison 1996), a relation can be found between the users' preference for text and the textual metaphor, and the one defined as solid space focused on nodes (Tab.8, A1, E1). Among the 32.9% of users who see the web through a textual metaphor (Tab.8, A7), 19.3% consider written text the most interesting form of communication (Tab.8, A1).
Within the 39.8% who look at the web through the metaphor of a solid space (Tab.8, E7), the users who choose the textual form are 16.1% (Tab.8, E1). The relation does not appear particularly strong (Tab.8, phi-squared = 0.365) because the users who, while splitting between the two metaphors, express greater interest in the information conveyed by images are also numerically significant (Tab.8, A2, E2). Crossing the data on the kind of information users say they favour with the data on the motivations that lead them to visit a web site, it is once again possible to highlight divergent attitudes in relation to the first variable.

Tab.9 – Bivariate analysis of the answers to questions 13, "What kind of information do you prefer to find on a web site?", and 24, "What motivation leads you to visit the site of a museum or an art gallery?" (phi-squared = 0.463). The answer categories for question 24 are: getting information about the art collections before a visit; getting information about opening times and services; visiting art collections that are not visible to the public; being updated on the events and initiatives of the museum or gallery; getting further information after a visit; seeing what has changed on the site; other. The individual cell values are largely illegible in the source.
The motivations that can be related to a textual kind of information are indicated mainly by the users who choose this form (Tab.9, A1, B1), while motivations more closely tied to the visual dimension are indicated mainly by users who say they favour images (Tab.9, D2, E2). A further confirmation of this reading of the data is offered by the fact that there are motivations tied to information where the textual form strongly prevails, such as opening times or the services offered, and information where the visual form strongly prevails, such as visiting art collections that are not visible to the public; on these the users' tendency is very clear-cut, with an almost mirror-image distribution (Tab.9, A1, A2 vs. E1, E2). For motivations tied to information where the prevalence of one form over the other is less marked, such as information about the art collections before a visit, or updates on events and initiatives, the distribution is correspondingly less sharp (Tab.9, B1, B2 vs. D1, D2). As for the advantages that can justify an exclusively virtual visit to an art collection, no divergence seems to emerge between the attitudes of the users who lean towards the textual or the visual form of information (Tab.10, E1, E2, F1, F2). The advantages of a visit through the web are those connected to the intrinsic characteristics of the medium: interactivity, the capacity to convey a large amount of information in different forms, and the personalisation of the visit on the basis of the individual user's profile.
Tab.10 – Bivariate analysis of the answers to questions 13, "What kind of information do you prefer to find on a web site?", and 26, "If you think it is possible to visit an art collection exclusively on the web, what is its main advantage?" (phi-squared = 0.488). The answer categories for question 26 are: the works cannot be exhibited to the public; the virtual collection on the web can provide more information; the virtual collection on the web has no closing times; visitors can be alone during the virtual visit; the virtual collection can be adapted to the visitor, providing more or less information through different media according to the visitor's profile (personalisation); the virtual collection can present the artworks from different perspectives (historical context, criticism, comparison with other works); other. The individual cell values are largely illegible in the source.

5. Profiles
Through the results of the questionnaire it is possible to draw some profiles of the use of the web as a medium for the visual arts. The textual dimension of information is important for all users, and the search for information takes place in an almost exclusively textual way; interest in the visual dimension prevails in the preliminary phase of navigation among the pages of a site, when no textual elements are available to orient the search, and in the final phase, when the object of interest is not textual, as in the case of a painting, a drawing or a photograph.
The centrality of the textual metaphor is confirmed by the semantic fields identified with reference to the web: among the five semantic domains proposed, preference goes clearly to the text and to the physical place. These two metaphors also prevail because of their greater familiarity from the user's real world. Most of the information conveyed through the web is received through reading, while its articulation is often perceived through a spatial metaphor: one speaks of the architecture of a site and of content maps, and the graphic design of a site itself sometimes evokes a real or hypothetical building. The space of the text is the organizing principle of the page, the basic element, at least for now, of web navigation, while physical space organizes the structure of the links between pages, where the continuity characteristic of text breaks up into the hypertextual dimension. Users tend to regard websites devoted to the arts as sources of information about the places where works are exhibited: they mostly look there for service information, such as visiting hours, and for information about the works held before an actual visit; many users visit the same site repeatedly, thus developing a familiarity that allows them to find their way even within complex structures. The presentation of works of art on the web is appreciated when it offers something that real exhibition spaces cannot offer: in particular, the possibility of seeing works that are not on display, or information that cannot be presented during a museum visit. An online art collection should exploit the possibilities of the medium by creating an interactive environment able to organize the contextual information around the works in a personalized way, based on the user's choices.
The organization of the works should not follow the organization of the collection in the real world, but offer different perspectives, for instance by relating works located in different physical places, or elements of the works that the public cannot appreciate in a traditional visit. Interactivity is appreciated particularly when connected to the textual dimension; the use of special tools requiring plug-ins is acceptable only when tied to a specific function with high informational content, such as zooming functions or the three-dimensional viewing of a work. Part of the work done so far can be summarized by outlining two navigation styles, which do not represent the behavior of different types of individuals but rather two extremes of a continuum within which the user's behavior tends to vary according to goals and context.

Tab.11 – Summary of the user profiles

Characteristic: Textual navigation (analytic) / Visual navigation (holistic)
Context and motivations: search for information focused on a concept that can be expressed in textual form / exploration without a precise goal, or use of intrinsically visual information at the end of a search or a navigation
Characteristics of the links: precision and pertinence of the definitions, high informational content about the content that follows / ease of reading, speed of identification, coherence
Type of information: service information, and in general all information that can take textual form / emotional information, interactivity focused on salient features of the represented object, conventional/iconic signs

We can think of a user who moves through the pages, explores and searches, adopting from time to time attitudes and actions that reflect an analytic or a holistic perception of the complex message constituted by a web space.
We will see later how many sites are, more or less consciously, built precisely so as to favor the adoption of one navigation style over the other, depending on the depth level and the context.

REFERENCES
[1] Chris S. Hutchison, Paolo Raviolo, Review of Visual Art Representation and Communication on the Web, Proceedings of ICHIM01 (International Cultural Heritage Informatics Meeting 2001), edited by David Bearman and Franca Garzotto, 2001, vol. 1, pp. 247-261
[2] James P. Spradley, Participant Observation, London, Holt, Rinehart and Winston, 1980
[3] Peter L. Brusilovsky, Methods and techniques of adaptive hypermedia, User Modeling and User-Adapted Interaction 6, 1996, pp. 87-129
[4] Noam Chomsky, Language and Mind, New York, Harcourt Brace Jovanovich, 1972
[5] Chris S. Hutchison, A sense of place: The digital Museum, Proceedings of EVA 96 (Conference and Exhibition on Electronic Imaging and the Visual Arts), Paris, 4th-6th December, 1996
[6] Kenneth D. Bailey, Methods of Social Research, The Free Press, New York 1982; It. trans.: Metodi della ricerca sociale, Il Mulino, Bologna 1995, p. 461
[7] Kenneth D. Bailey, Methods of Social Research, The Free Press, New York 1982; It. trans.: Metodi della ricerca sociale, Il Mulino, Bologna 1995, pp. 504-511
[8] http://www.spss.com

Automatic Lightness and Color Adjustment of Visual Interfaces
A. Rizzi, C. Gatta, M. Maggiore, E. Agnelli, D. Ferrari, D. Negri
Department of Information Technology – University of Milano, Via Bramante, 65 – 26013 Crema (CR) – Italy

Abstract

In this paper we propose the use of an algorithm, inspired by some human visual system adaptation mechanisms and designed for digital image enhancement, as an automatic evaluator of visual interfaces. It is able to suggest alternative color, lightness and contrast configurations, maximizing the visual information content of the interface and its readability.
Moreover, in some cases it is able to highlight undesired perceptual effects, such as problematic readability or contrast configurations. Tests and results on several commercial interfaces are presented.

A perceptual approach to interface assessment

Software requires a good interface, not only in terms of the spatial distribution of commands or of function readability and grouping, but also in terms of color and text visual appearance. In this paper we propose the use of an algorithm designed for digital image enhancement as an automatic evaluator of visual interfaces. This algorithm, called ACE for Automatic Color Equalization, inspired by some human visual system adaptation mechanisms, is able to correct the image color and maximize its overall dynamic range with no user control or a-priori information. In particular, ACE performs its filtering based on the original image information and the relationships within it. The ACE structure is subdivided into two steps: the first modifies the image using the above-mentioned perceptual mechanisms, while the second maps the intermediate result into the available dynamic range (e.g. that of the monitor). A detailed description of the ACE model and its implementation can be found in [1][2][3]. The output of ACE interface filtering suggests alternative color, lightness and contrast configurations, maximizing the visual information content of the interface and its readability. As far as the authors know, this is a new approach.

Test and results

This approach has been tested on several commercial software interfaces. The original and filtered interfaces have been displayed on two different calibrated CRT monitors to a population of one hundred users and their preferences have been collected. In particular, users have been asked to choose the preferred interface in terms of lightness, color palette, readability and overall pleasantness.
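The two-step structure described above can be illustrated with a minimal one-dimensional grayscale sketch; the saturated response function, the slope value and the inverse-distance weighting below are assumptions of this toy version, not the published ACE implementation:

```python
def ace_filter(image, slope=5.0):
    """Toy two-step sketch: (1) each pixel is recomputed from its
    contrast relationships with every other pixel, weighted by inverse
    distance; (2) the intermediate result is linearly mapped into the
    available dynamic range [0, 255]."""
    n = len(image)

    def r(d):
        # saturated linear response to a pixel-value difference
        return max(-1.0, min(1.0, slope * d / 255.0))

    # Step 1: spatially weighted relative response
    inter = []
    for i, p in enumerate(image):
        num = sum(r(p - q) / abs(i - j) for j, q in enumerate(image) if j != i)
        den = sum(1.0 / abs(i - j) for j in range(n) if j != i)
        inter.append(num / den)

    # Step 2: stretch the intermediate values to the display dynamic
    lo, hi = min(inter), max(inter)
    scale = 255.0 / (hi - lo) if hi > lo else 0.0
    return [round((v - lo) * scale) for v in inter]
```

Step 1 is where the perceptual adaptation mechanisms act; step 2 corresponds to mapping onto the monitor's dynamic, so the filtered output always uses the full available range.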
Users have been classified in terms of frequency of computer usage (never, occasionally, average, often, every day), skill in computer usage (novice, medium, expert) and gender (male, female). Results on 1000 users are shown in the following chart.

[Chart: percentage of preferences (0-60%) for the original vs. the ACE-filtered interfaces, on the four criteria: brightness, palette, readability, overall.]

In terms of number of preferences, the interfaces modified by ACE are preferred in brightness and readability. On the contrary, color palette preferences decrease. An example of the ACE filtering effect is visible in Fig. 1.

Fig. 1: A web interface screenshot (www.msn.it) and the relative ACE output.

The original palette of Fig. 1 is quite pleasant and its readability is satisfactory, but the bottom-left menu has white text on a light blue background, a choice that can lead to readability problems; ACE gives a result in which the background has been modified to a dimmer blue, increasing in this way the text readability. The overall lightness is also modified; the ACE output is dimmer than the original, which results in a more balanced use of monitor intensities. The tested interfaces have been clustered into 4 categories: interfaces based upon the standard Windows GUI (e.g. the WinXP control panel), interfaces with a particular design (e.g. Windows Media Player or other multimedia players), interfaces with text and figures (e.g. the Encarta99 encyclopedia or video game menu screens) and web pages (e.g. www.msn.it). Cross correlations with these clusters can lead to preliminary comments about complex human factors: e.g., when asked to choose the preferred color palette, the user may be influenced by the fact that the standard Windows GUI is usually recognized as having a familiar (usually standard) palette; a different combination of colors may induce the user standpoint "this palette seems very strange with respect to the standard", resulting in a biased preference.
Moreover, these clusters permit a qualitative evaluation of ACE performance in predicting undesired configurations. Unsatisfactory results are usually correlated with interfaces that have large uniform areas; on the other hand, some ACE capabilities can be highlighted by applying it only to particular clusters. In some cases this process is able to highlight undesired perceptual effects, such as problematic readability or contrast configurations. After the ACE application, these configurations can be corrected; the algorithm itself suggests possible solutions in the output image (e.g. the interfaces in Fig. 1). To conclude, we present the case of a game interface (Delta Force: Black Hawk Down), visible in Fig. 2, before (left) and after (right) ACE filtering.

[Chart: percentage of preferences (0-80%) for the original vs. the ACE-filtered game interface, on brightness, palette, readability and overall.]

Fig. 2: A game interface, the relative ACE output and the evaluation results.

In this case, as can be noticed from the corresponding chart, the interface modified by ACE has been preferred from every point of view. Summarizing, we have proposed the use of ACE as a tester for visual interfaces. This approach does not aim to eliminate the designer's role in building visual interfaces, but it can be used as an automatic prompter. Applying ACE results in an alternative interface, with modified contrast and colors, that can be used as a suggestion for possible interface refinement.

References
[1] A. Rizzi, C. Gatta, D. Marini, "A New Algorithm for Unsupervised Global and Local Color Correction", Pattern Recognition Letters, Vol 24 (11), pp. 1663-1677, July 2003.
[2] A. Rizzi, C. Gatta, D. Marini, "Color Correction between Gray World and White Patch", Electronic Imaging 2002, 20-25/01/02, San Jose, California (USA).
[3] C. Gatta, A. Rizzi, D. Marini, "ACE: an Automatic Color Equalization algorithm", CGIV02, the First European Conference on Color in Graphics Image and Vision, 2-5/4/2002, Poitiers (France).
Developing Affective Lexical Resources
Alessandro Valitutti, Oliviero Stock, and Carlo Strapparava
ITC-irst, Via Sommarive 18, 38050 - Povo - Trento, Italy
{stock, strappa, alvalitu}@itc.it
http://tcc.itc.it/people.html

1 Introduction

Affective computing is advancing as a field that allows a new way of human-computer interaction, in addition to the use of natural language. There is a wide perception that the future of HCI lies in themes such as entertainment, emotions, aesthetic pleasure, motivation, attention, engagement, etc. Studying the relation between natural language and affective information, and dealing with its computational treatment, is becoming crucial. According to recent research, computers' affective ability plays a vital role in improving interaction with users. This ability depends not only on affective expressiveness, but also on the capacity to detect the affective state of the user. Researchers have tried to detect the user's affective state in many ways: through facial expressions, speech, physiology, and text. In particular, text is an important modality for sensing affect because the bulk of computer user interfaces today are textually based. Examples of such applications include synthetic agents that want to give an affective response to the user input at the sentence level (e.g. an affective text analyzer architecture [6]), affective text-to-speech systems, etc. In other applications, the affective interaction is aimed at influencing the emotional state of the user. We have put effort in this direction, in particular addressing some aspects of computational humour. NLP was exploited to build systems capable of inducing amusement and affecting the emotional state of users (e.g. HAHAcronym [11]). For all these applications it is necessary to have a linguistic resource containing affective knowledge. Unfortunately, at present there is no sufficiently wide resource of this type.
The present paper covers the relation between the lexicon and affective concepts. We developed a preliminary version of a lexical resource (that we called Affect) containing the words of an affective lexicon. Then, considering the lexical knowledge base WordNet, we linked the words in Affect to their senses, obtaining a resource we called WordNet-Affect. Finally, we took into account OpenMind, a database of common sense sentences containing a considerable amount of common sense knowledge [10]. Exploiting WordNet-Affect and a word sense disambiguation algorithm [8], we automatically chose an "affective-oriented" subset of OpenMind, named OpenMind-Affect. OpenMind-Affect is composed of sentences, patterns, and parse trees containing affective concepts. In this way the affective lexicon is enriched with "contextual words", which do not directly refer to affective states (emotions and moods), but which are meaningful from an affective point of view.

2 Affect

Affect is a lexical database containing 1,903 terms directly or indirectly referring to mental (e.g. emotional) states. The main part of Affect consists of nouns (539) and adjectives (517). There is a smaller number of verbs (238) and a tiny set of adverbs (15). In order to collect this material, we started from an initial set of psychological adjectives (in particular, affective adjectives). The collection was extended with the help of dictionaries. In a second step, the nouns were added through an intuitive correlation with the adjectives. In a similar way, verbs and adverbs were added. For each item a frame was created in order to add lexical and affective information. Lexical information includes the correlation between English and Italian terms, parts of speech (pos), definitions, synonyms and antonyms. The attribute posr relates terms having different pos but pointing to the same psychological category.
For example, the adjective cheerful is semantically linked to the noun cheerfulness, to the verb cheer up and to the adverb cheerfully. Affective information is a reference to one or more of the three main kinds of theories of emotion representation: discrete theories (based on the concept of cognitive evaluation), basic emotion theories and dimensional theories. According to the work of Ortony et al. [9], terms are classified into emotional terms, non-emotional affective terms (e.g. mood) and non-affective mental state terms. Other terms are linked with personality traits, behaviors, mental attitudes, physical or bodily states and feelings (such as pleasure or pain). Some example terms and their categories are given in Table 1.

Table 1. Categories and terms.
Category         Example Term
Emotion          anger
Cognitive state  doubt
Personality      competitive
Behaviour        cry
Mental attitude  scepticism
Feeling          pleasure

Discrete emotional information is characterized by an attribute whose value corresponds to one of the 24 emotional categories described by Elliot [2]. Another attribute allows us to indicate one of the six basic emotions cited by Ekman [3]. Dimensional emotional information needs two attributes denoting valence (that is, how positive or negative a given emotional state is) and arousal (that is, the level of emotional excitation). Part of the information was collected from dictionaries and from the scientific documentation on the psychology of emotions; the remaining information was inserted on an intuitive and arbitrary basis. The former kind of data was associated with references to the sources; the latter is the rough material for a subsequent critical review (for example, by psychologists or lexicologists).
As an example, here is one of the frames from the database:

[name]: anger
[ita]: <rif src=c>rabbia, collera</rif> <rif src=wn sense=1>ira, collera, arrabbiatura, rabbia</rif> <rif src=wn sense=2>collera, ira, bile, furia, rabbia</rif>
[def]: <rif src=wn sense=1>(Psychology) a strong emotion; a feeling that is oriented toward some real or supposed grievance</rif> <rif src=wn sense=2>(Physiology) the state of being angry</rif>
[synonyms]: <rif src=wn sense=1>choler, ire</rif> <rif src=wn sense=2>angriness</rif> <rif src=mw>fury, indignation, ire, mad, rage, wrath</rif>
[antonyms]: <rif src=mw>forbearance</rif>
[pos]: n
[posr]: <v>anger</v> <a>angry</a> <r>angrily</r>
[fundamental]: <rif src=d>a</rif>
[elliot]: anger
[valence]:
[arousal]: 2
[ortony]: emotion
[notes]:

3 WordNet-Affect

WordNet is a lexical database for English developed at Princeton University [4]. The basic lexical relationship in WordNet is synonymy. Groups of synonyms are used to identify lexical concepts, which are also called synsets. The availability of a large sense repository such as WordNet [4] has made it possible to represent affective concepts as synsets in WordNet. So we projected the affective information of the words in the Affect database onto the corresponding senses of WordNet. The opportunity to interface Affect with WordNet allows us to outline different developments. On the one hand it is possible to extend the collection through a search for synonyms and antonyms (performed on each of the terms of Affect that are contained in WordNet). On the other hand it is useful to compare the affective information of the database with the WordNet hyperonym hierarchy restricted to the Psychology domain [7], in order to propose an enrichment of the structure of this semantic field. After analyzing the information provided by WordNet about the synsets that contain words of Affect, we concluded that synsets can be used to represent affective concepts.
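The projection of Affect entries onto WordNet senses can be illustrated with a small in-memory sketch; the dictionaries, sense keys and field names below are hypothetical stand-ins for the actual Affect database and for real WordNet lookups:

```python
# Hypothetical excerpt of an Affect frame: a word carries its affective labels
affect = {
    "anger": {"pos": "n", "ortony": "emotion", "arousal": 2,
              "basic_emotion": "anger"},
}

# Hypothetical sense inventory standing in for a WordNet query
senses = {"anger": ["anger.n.01", "anger.n.02"]}


def project(word):
    """Attach the word-level affective labels to each of the word's
    senses, yielding WordNet-Affect-style (synset, labels) pairs."""
    labels = affect[word]
    return {s: labels for s in senses.get(word, [])}
```

A sketch like this makes the word-to-synset projection concrete: every sense of an Affect entry inherits the entry's affective annotation, which is then the unit stored in the synset-based resource.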
In fact, at the level of single affective concepts we think that the characterization as synsets is quite accurate. On the other hand, additional WordNet relations, such as the ISA hierarchy, do not always seem appropriate from an affective point of view. We have given the name WordNet-Affect to the subset of WordNet that includes 1,314 synsets representing the senses of the entries in Affect.

Table 2. Number of affective synsets and words, grouped by part of speech, in WordNet-Affect.
          Nouns  Adjectives  Verbs  Adverbs  Total
Synsets   535    557         200    22       1314
Words     1336   1472        592    40       3340

4 OpenMind-Affect

Once a fixed set of affective concepts has been identified, represented by the synsets of WordNet-Affect, it is useful to distinguish those directly referring to affective states from those denoting their causes and consequences. It is then necessary to have a wide resource of common sense expressions whose prototypical knowledge allows us to extract contextual information (e.g. events that typically cause specific emotions). With reference to the work of Liu et al. [6], we have used OpenMind as a source of stereotypical knowledge. OpenMind is a wide common sense knowledge base containing sentences, linguistic patterns and parse trees. Unlike [6], the sentences of OpenMind were annotated through a word sense disambiguation tool, developed at ITC-irst [8], in order to associate each word with the corresponding WordNet sense. In this way, it was possible to identify the sentences containing words with an affective meaning and to select an affectively significant subset of OpenMind, which we have called OpenMind-Affect. Using only the words in Affect, we selected a set of 74,455 sentences in OpenMind. Using the words in WordNet-Affect, we increased the size of this set to 171,657 sentences. This resource is employed as an environment for experimentation on an affective lexicon. In particular, we aim at obtaining the following results:
1.
increasing the collection of affective concepts: to this aim, we need to identify, in the sentences of OpenMind-Affect, new words related to the known ones, in order to obtain new synsets to include in WordNet-Affect;
2. collecting contextual information, such as events that typically cause specific emotions;
3. using contextual information in order to increase the affective knowledge of the lexical items, whenever possible.

In order to extract the contextual knowledge, it is necessary to exploit some linguistic patterns connecting words denoting affective states with contextual words. For example, the pattern X causes Y, where X denotes an event and Y refers to an emotional state, allows us to identify a typical cause of that emotion. The lexical semantics of emotional adjectives [5] allows us to deduce some of these structures even when they are not explicitly present in the sentence. For example, the adjective cheerful may in general denote the emotional state in different ways (stative, manifestative, causative). Nevertheless, if it is included in the noun phrase cheerful flower, it takes the causative reading and implicitly expresses the fact that the flower causes cheerfulness, which allows us to add a potentially new contextual concept to the set of affective concepts.

5 Conclusions

With the three resources illustrated above (Affect, WordNet-Affect, and OpenMind-Affect) we want to contribute towards the realization of an "affective-oriented" lexical knowledge base. Possible future applications are related both to generation and to recognition. In the first case, we can think of systems dealing with the automatic production of affective expressions for advertising. As an example of the second case, we can conceive systems for the extraction of affective information from human verbal reports. In the field of human-computer interaction, a stimulating perspective is the realization of artificial agents that give affective responses to a user's verbal input.
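The X causes Y pattern discussed above can be approximated, for illustration only, with a regular expression over plain sentences; the word list and the pattern below are toy assumptions, not the authors' sense-annotated extraction tool:

```python
import re

# Toy stand-in for the affective lexicon (real extraction would match
# WordNet-Affect senses, not surface strings)
AFFECT_WORDS = {"anger", "joy", "fear", "cheerfulness"}

# "X causes Y": X is a candidate event phrase, Y a candidate state word
PATTERN = re.compile(r"(?P<cause>[\w ]+?) causes (?P<state>\w+)")


def causal_pairs(sentences):
    """Return (event, emotional-state) pairs where the state word
    belongs to the affective lexicon."""
    pairs = []
    for s in sentences:
        m = PATTERN.search(s.lower())
        if m and m.group("state") in AFFECT_WORDS:
            pairs.append((m.group("cause").strip(), m.group("state")))
    return pairs
```

Filtering on the affective lexicon is what separates genuine emotion causes ("losing a game causes anger") from unrelated causal statements ("rain causes floods").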
The idea is that multimodal interfaces (e.g. embodied characters) may express their affective state both linguistically and graphically, in order to induce affective reactions in the users.

References
1. D'Urso, P., Trentin, B.: Psicologia delle emozioni. Ed. il Mulino, 1988.
2. Elliot, C. D.: The Affective Reasoner: a process model of emotions in a multi-agent system. Northwestern University, Evanston, 1992.
3. Ekman, P.: An argument for basic emotions. Cognition and Emotion, 6 (1992), 169-200.
4. Fellbaum, C. (ed.): WordNet: an electronic lexical database. MIT Press, 1998.
5. Goy, A.: Lexical Semantics of Emotional Adjectives. In Feist, S., Fix, S., Hay, J., and Moore, J. (eds.): Linguistics in Cognitive Science: Proceedings of Student Conference in Linguistics 10, MIT Working Papers in Linguistics 37, MIT Press, 2000.
6. Liu, H., Lieberman, H., Selker, T.: A Model of Textual Affect Sensing using Real-World Knowledge. Proceedings of the Seventh International Conference on Intelligent User Interfaces (IUI 2003), pp. 125-132, Miami, Florida.
7. Magnini, B. and Cavaglià, G.: Integrating Subject Field Codes into WordNet. In: Gavrilidou, M., Crayannis, G., Markantonatu, S., Piperidis, S., Stainhaouer, G. (eds.): Proceedings of LREC-2000, Second International Conference on Language Resources and Evaluation, Athens, Greece (2000), 1413-1418.
8. Magnini, B., Strapparava, C., Pezzulo, G., and Gliozzo, A.: The Role of Domain Information in Word Sense Disambiguation. Natural Language Engineering, 8(4):359-373, 2002.
9. Ortony, A., Clore, G. L., and Foss, M. A.: The psychological foundations of the affective lexicon. Journal of Personality and Social Psychology, 53, 751-766, 1987.
10. Singh, P., Lin, T., Mueller, E. T., Lim, G., Perkins, T., and Zhu, W. L.: Open Mind Common Sense: Knowledge acquisition from the general public.
In Proceedings of the First International Conference on Ontologies, Databases, and Applications of Semantics for Large Scale Information Systems, Heidelberg, Springer-Verlag, 2002.
11. Stock, O. and Strapparava, C.: Getting Serious about the Development of Computational Humor. To appear in Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-03), Acapulco, Mexico, August 2003.

SAMIR: A 3D Web Intelligent Interface
F. Zambetta, G. Catucci, F. Abbattista, G. Semeraro, M. Lamarca, F. De Felice
Dipartimento di Informatica – Università di Bari - Via E. Orabona, 4 – I-70125
[email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

Abstract

Intelligent web agents, which exhibit an autonomous behavior rather than a merely reactive one, are daily gaining popularity, as they allow a simpler and more natural interaction between the user and the machine, entertaining him/her and giving to some extent the illusion of interacting with a human-like interface. Our system, SAMIR, is an intelligent web agent that uses: a 3D face, animated via a morph-target technique, to convey expressions to the user; a slightly modified version of the ALICE chatterbot to provide the user with dialoguing capabilities; and an XCS classifier system to manage the consistency between the conversation and the facial expressions. Experimental results obtained by applying SAMIR to a virtual bookselling scenario, involving a Web bookstore, are reported.

Introduction

Intelligent virtual agents are software components designed to act as virtual advisors in applications, especially web ones, where a high level of human-computer interaction is required.
Indeed, their aim is to replace the classical WYSIWYG interfaces, which are often difficult for casual users to manage, with reactive and possibly pro-active virtual cicerones able to understand users' wishes and converse with them, find information and execute non-trivial tasks usually activated by pressing buttons and choosing menu items. Frequently these systems are coupled with an animated 2D/3D look-and-feel, embodying their intelligence via a face or an entire body. This way it is possible to enhance users' trust in these systems by simulating a face-to-face dialogue, as reported in [1]. A general observation is that the state-of-the-art systems, though interesting, are often heavy to implement, difficult to port onto different platforms, and usually not embeddable in Web browsers. These reasons led us to pursue a light solution, SAMIR (Scenographic Agents Mimic Intelligent Reasoning), which turns out to be portable, easy to implement and fast enough in medium-sized computer environments. For a web-based solution with a very efficient client one could see []: there the same 3D API we used was chosen, but the system has no reasoning module and the 3D character achieves no autonomy during the interaction. SAMIR is a digital assistant in which an artificial-intelligence-based Web agent is integrated with a purely 3D humanoid, robotic, or cartoon-like layout [2]. The remainder of the paper is organized as follows. The next section describes the architecture of SAMIR. In the third, fourth and fifth sections, the three main modules of SAMIR are detailed. Some examples of SAMIR in action are given in the sixth section. Finally, conclusions are drawn in the last section.

The Architecture of SAMIR

SAMIR (Figure 1) is a client-server application composed of 3 main sub-systems: the Dialogue Management System (DMS), the Animation Module and the Behavior Manager.
The DMS is responsible for directing the flow of information in our system: when the user issues a request from the web site, an HTTP request is directed to the DMS Server to obtain the HTTP response storing the chatterbot answer. At the same time, based on the events raised by the user on the web site and on his/her requests, a communication between the DMS and the Behavior Manager is set up. This results in the expression the Animation System should assume, coded by a set of coefficients for each of the possible morph targets [3] in our system: we use some high-level morph targets corresponding to the 6 fundamental expressions [4], but even low-level ones are a feasible choice in order to preserve full MPEG-4 compliance. After this interpretation step, a key-frame interpolation is performed to animate the current facial expression.

Fig. 1. The Architecture of SAMIR.

The Animation Module

The FACE (Facial Animation Compact Engine) module is an evolution of the Fanky Animation System [6]. We followed the same philosophy introduced in Fanky, that is, the implementation of SACs (Standard Anatomic Components). The basic idea underlying them is to define face regions acting as objects, in the object-oriented sense of the term. The offered services correspond to different low-level deformations, such as FAPs (Facial Animation Parameters) used during the animation process, or face sculpting and remodeling. Moreover, SACs make it possible to select at runtime the numerical method employed to deform the vertices associated with a particular SAC. We adopt the linear interpolation of 3D face key-frames because, in our opinion, it represents the best compromise between speed and accuracy. FACE was conceived with lightness and performance in mind, so it supports a variable number of morph targets: for example, we currently use either 12 high-level ones or the entire "low-level" FAP set, in order to achieve MPEG-4 compliance.
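The morph-target blending and key-frame interpolation described above can be sketched as follows; the one-value-per-vertex layout and the function names are illustrative assumptions, not the FACE API:

```python
def blend_morph_targets(neutral, targets, weights):
    """Each vertex is the neutral position plus the weighted sum of
    per-target displacements: the linear blending underlying high-level
    expression morph targets (vertices simplified to scalars here)."""
    out = []
    for i, v in enumerate(neutral):
        out.append(v + sum(w * (targets[name][i] - v)
                           for name, w in weights.items()))
    return out


def interpolate(w0, w1, t):
    """Linear key-frame interpolation between two expression weight
    sets, for t in [0, 1]; animating t moves the face between frames."""
    return {k: (1 - t) * w0[k] + t * w1[k] for k in w0}
```

Animating only the small weight dictionaries, rather than every vertex, is what makes this the "best compromise between speed and accuracy" the text refers to: per-frame cost is one blend over the vertex set.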
Using just a small set of high-level parameters can be extremely useful when debugging the behavior module, because it is easier to reason about behavioral patterns in terms of explicit expressions than in terms of longer sets of facial parameters. An unlimited number of timelines can be used, allocating one channel for the stimulus-response expressions, another one for non-conscious eyelid reflexes, and so on. We are currently developing a custom editor able to perform the same tasks performed by FaceGen but optionally giving more control to the user: this way each user may enjoy the process of creating a new face, tailored to his/her wishes [8].

The Dialogue Management System

The Dialogue Management System is responsible for the management of user dialogues and for the extraction of the information necessary for book searching. It can be viewed as a client-server application composed mainly of two software modules communicating through the HTTP protocol. The client-side application is a Java applet whose main aim is to let the user type requests in a human-like language and send them to the server-side application for processing. The other important task it performs is retrieving specific information on the World Wide Web, based on the responses elaborated by the server-side application, through JavaScript technology. On the server side we have the ALICE Server Engine, enclosing all the knowledge and the core system services needed to process user input. ALICE is an open source chatterbot developed by the ALICE AI Foundation and based on the AIML language (Artificial Intelligence Markup Language), an XML-compliant language that gives us the opportunity to exchange dialogue data through the World Wide Web. ALICE has been fully integrated into SAMIR and all its knowledge has been stored in AIML files containing all the patterns matching user input.
In order to obtain a system that lets users navigate a bookshop web site, we wrote a number of AIML categories aimed at book searching and shopping. Our categories were chosen to cover a very large set of possible book requests: they comprise the book title, author, publisher, publication date, subject, ISBN code and a general keyword field. Successful examples of book requests for the Amazon bookshop web site are, for instance, "I want a book written by Sepulveda" and "I am searching for all books written by Henry Miller and published after 1970", or, in alternative forms, "Could you find some book written by Fernando Pessoa?" and "Look for some book whose subject is fantasy".

The Behavior Generator

The Behavior Generator aims at maintaining consistency between the facial expression of the character and the tone of the conversation. The module is mainly based on Learning Classifier Systems (LCS), a machine learning paradigm introduced by Holland in 1976 [9]. The learning module of SAMIR has been implemented through an XCS [10], a new kind of LCS which differs in many respects from Holland's traditional framework. The most appealing characteristic of this system is that it is strictly related to the Q-learning approach, but is able to generate task representations that can be more compact than tabular Q-learning [11]. At discrete time intervals, the agent observes a state of the environment, takes an action, observes a new state and receives a reward. The basic components of an XCS are: the Performance Component, which, on the basis of the detected state of the environment, selects the best action to be performed; the Reinforcement Component, whose aim is to evaluate the reward to be assigned to the system; and the Discovery Component, which, in case of degrading performance, is devoted to the evolution of new, better-performing rules.
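The observe-act-reward loop can be sketched as follows. This is a greatly simplified reading of the scheme, not SAMIR's actual XCS: there is no discovery component and no accuracy-based fitness, only a Widrow-Hoff update of each firing rule's reward prediction, and the states, actions, reward values and learning rate are illustrative assumptions:

```python
# Simplified classifier-system loop: rules map a conversation-tone
# state to a facial expression; the prediction of the rule that fires
# is nudged toward the observed reward (Widrow-Hoff update).
# All states, actions, rewards and constants here are illustrative.

BETA = 0.2  # learning rate of the prediction update

class Rule:
    def __init__(self, condition, action, prediction=10.0):
        self.condition = condition    # observed state, e.g. "insult"
        self.action = action          # expression, e.g. "anger"
        self.prediction = prediction  # expected reward of this rule

def step(rules, state, reward_fn):
    """Select the matching rule with the highest predicted reward
    (performance), then reinforce its prediction (reinforcement)."""
    match_set = [r for r in rules if r.condition == state]
    best = max(match_set, key=lambda r: r.prediction)
    reward = reward_fn(state, best.action)
    best.prediction += BETA * (reward - best.prediction)
    return best.action

rules = [Rule("insult", "anger"), Rule("insult", "joy"),
         Rule("salutation", "joy")]

# Reward the intuitively consistent pairing (insult -> anger).
def reward_fn(state, action):
    return 100.0 if (state, action) == ("insult", "anger") else 0.0

for _ in range(20):
    step(rules, "insult", reward_fn)
# After training, the prediction of the (insult -> anger) rule
# dominates, so the agent reliably reacts to insults with anger.
```

A full XCS would additionally evolve new rules via a genetic algorithm when performance degrades, which this sketch omits.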
Behavior rules are expressed in the classical format if <condition> then <action>, where <condition> (the state of the environment) combines different conversation tones, such as user salutations, the way the user formulates requests to the agent, user compliments or insults to the agent, and user permanence in the Web page, while <action> represents the expression that the Animation System displays during user interaction. In particular, the expression is built as a linear combination of a set of fundamental expressions that includes the basic emotion set proposed by Paul Ekman, namely anger, fear, disgust, sadness, joy, and surprise [4]. Other emotions and many combinations of emotions have been studied, but remain unconfirmed as universally distinguishable. However, we extended the basic set of expressions to include some typical human expressions such as bother, disappointment and satisfaction. In a preliminary experiment, SAMIR was able to learn some pre-defined rules of behavior and to generalize new behavioral patterns, updating the initial set of rules [2]. In this way, SAMIR is comparable to a human assistant who, after a preliminary training, continues to learn new rules of behavior through experience and interaction with human customers.

Experimental Results

In this section we present some experimental results obtained from the interaction between SAMIR and some typical users searching for books about topics such as literature, fantasy and horror, or for more specific books whose details, such as title, author and publisher, are given. When the user accesses the Web site, SAMIR introduces itself and asks the user his/her name for authentication and recognition. In the course of the conversation, if the user asks for a book, SAMIR issues the user's query to the bookshop site. Figure 2 shows an example of a more sophisticated query, in which the user requests Henry Miller books published after 1970.
Fig. 2. All books by H. Miller published after 1970

In this case the user heavily insults SAMIR and, consequently, its expression becomes angry. Figure 3 shows a specific query from a user interested in a list of books by the author Sepulveda.

Fig. 3. Results for the author Sepulveda

Conclusions

In this paper we presented a first prototype of a 3D agent able to support users in searching for books on a Web site. The prototype has been linked to a specific site, but we are implementing an improved version that will be able to query several Web bookstores simultaneously and to report to users a comparison based on different criteria, such as price, delivery times, and so on. Moreover, our future work will aim at giving the agent a more natural behavior. This can be achieved by improving the dialogues and, possibly, the text processing capabilities of the ALICE chatterbot, and by giving the agent a fully proactive behavior: the XCS should be able not only to learn new rules to generate facial expressions, but also to modify dialogue rules, to suggest interesting links and to supply effective help during site navigation.

References

1. Cassell, J., Sullivan, J., Prevost, S., & Churchill, E. (Eds.). (2000). Embodied Conversational Agents. Cambridge: MIT Press.
2. Abbattista, F., Paradiso, A., Semeraro, G., & Zambetta, F. (2002). An agent that learns to support users of a web site. In R. Roy, M. Koeppen, S. Ovaska, T. Furuhashi, & F. Hoffmann (Eds.). Soft Computing and Industry: Recent Applications (pp. 489-496). Berlin: Springer.
3. Fleming, B., & Dobbs, D. (1998). Animating Facial Features and Expressions. Hingham: Charles River Media.
4. Ekman, P. (1982). Emotion in the Human Face. Cambridge: Cambridge University Press.
5. Pandzic, I.S. (2001). Life on the Web. Software Focus Journal, 2(2), 52-59. John Wiley & Sons.
6. Paradiso, A., Zambetta, F., & Abbattista, F. (2001). Fanky: a tool for animating 3D intelligent agents. In de Antonio, A., Aylett, R., & Ballin, D. (Eds.). Intelligent Virtual Agents (pp. 242-243). Berlin: Springer.
7. Paradiso, A., Nack, F., Fries, G., & Schuhmacher, K. (1999). The Design of Expressive Cartoons for the Web – Tinky. In Proc. of the ICMCS Conference (pp. 276-281). IEEE Press.
8. Sederberg, T.W. (1986). Free-Form Deformation of Solid Geometric Models. Computer Graphics, 20(4), 151-160.
9. Holland, J.H. (1976). Adaptation. In R. Rosen & F.M. Snell (Eds.). Progress in Theoretical Biology. New York: Plenum.
10. Wilson, S.W. (1995). Classifier Fitness Based on Accuracy. Evolutionary Computation, 3(2), 149-175.
11. Watkins, C.J.C.H. (1989). Learning from Delayed Rewards. PhD thesis, University of Cambridge, Psychology Department.