CRITICAL PERSPECTIVE TAKING: PROMOTING AND ASSESSING ONLINE WRITTEN ARGUMENTATION FOR DIALOGIC FOCUS

In this article, we consider the impact of classroom instruction and an online argumentation tool (AT) on students’ written argumentation and 21st century skill development. Drawing on a wider study of 1:1 digital schools in Auckland, New Zealand, we examine the three-way relationship between argumentation teaching, student use of an online discussion board, and evidence of perspective taking. Longitudinal data from six elementary schools are analyzed, including 17 observations (in which the teaching focus was a nominated 21st century skill) and 253 student-written posts. Developmental profiles of student argumentation were determined using Kuhn and Crowell’s (2011) taxonomy of argumentation function demonstrating potential for (1) instructional focus and (2) practice or “dosage” effects. Integrated student argumentation profiles acknowledging the benefits of other perspectives were found to co-occur with a higher focus on argumentation instruction, but without increases in students’ critical reasoning. In addition to focus effects, repeated use of the AT suggests that stronger dosages positively influence student perspective integration. The implications for perspective taking and critical thinking through argumentation are discussed in relation to citizenship and resilience in 21st century digital contexts.

Argument has always played a central role in academic disciplines, notably in the scientific community, where ideas and explanations of phenomena that withstand the test of critical examination through argumentative discourse attain consensual acceptance as reliable knowledge (Longino, 1990). Forms of argumentation are now considered an important contributor to the citizenship skills necessary for a digital world in which there is ubiquitous and extensive access to information (and misinformation), increased susceptibility to the insularity of opinions and viewpoints, and an increased need to consider and respond positively to others (McGrew, Ortega, Breakstone, & Wineburg, 2017).
In this article, we consider the impact of classroom instruction and an online argumentation tool (AT) on students' written argumentation. Drawing on a wider study of 1:1 digital schools in Auckland, New Zealand, we examine the three-way relationship between classroom instruction, student use of an online discussion board, and evidence of perspective taking including critical thinking. Kuhn and Crowell (2011) maintain that argumentation follows a developmental pattern from single to dual followed by an integrated profile, the latter occurring as a dialogic focus becomes more proficient. Their approach has roots in sociocultural traditions (Vygotsky, 1987) and more contemporary perspectives (Resnick, Michaels, & O'Connor, 2010; Reznitskaya, Anderson, McNurlen, Nguyen-Jahiel, Archodidou, & Kim, 2001) in which everyday argumentation as a social practice is viewed as the basis for developing argumentative thinking and writing. In Vygotskian terms, intermental practice becomes internalized and represented as intramental. Improvement in the quality of thinking is also conceived in dialogic terms in which the argumentation process shifts from a single, support-own position to the recognition of weakness in other positions and reasoning (or dual focus). Further skill progression involves acknowledgment that what others believe or say may have strengths and weaknesses, as may one's own views. This more complex process of combining and weighing alternate viewpoints signifies integration, and the pathway to improved, critically evaluated decision making and belief. Counter-arguments and taking into account "the framework of alternatives" are essential to the development of dialogic focus (Kuhn, Hemberger, & Khait, 2014, p. 43).

A Social-Developmental Perspective of Argumentation
The role of critical perspective-taking
Developing expertise in argumentation is multifaceted (Kuhn, Zillmer, Crowell, & Zavala, 2013), extending beyond narrowly defined cognitive skills (e.g. producing a claim supported by evidence) to dialogic and metacognitive dimensions. A critical, dialogic focus implies both perspective taking and critical thinking. Perspective taking entails aspects of cognitive and emotional empathy (Cohen & Strayer, 1996): the ability to understand the views and mental states of others (cognitive) and the capacity to internally simulate and experience the emotions of others (affective). Critical thinking, on the other hand, is inherently evaluatory: rational skepticism and systematic reasoning inform judgements of what to believe or do (Ennis, 1996). The two sub-skills may seem somewhat contradictory, but scientific arguments containing rebuttals are considered to be of the highest quality when they "compare, contrast, and distinguish different lines of reasoning" (Osborne, 2010, p. 464). If the ability to think critically involves analyzing and evaluating multiple perspectives on a complex issue or question, then open-mindedness to alternatives necessitates "seeing through other eyes." Evidence for building argumentation skill-sets in a community of learners has been demonstrated in subject areas such as science (Larrain, Freire, López, & Grau, 2019; Rapanta, Garcia-Mila, & Gilabert, 2013) and English language arts (Brown, 2016). The instructional designs tend to focus on interpersonal and intrapersonal skills for engaging effectively in the practices of face-to-face communities. Less is understood about how argumentation impacts learning through complex cognitive and social skills in online environments, including the development of intra-dialogic focus.
For example, Kuhn and colleagues (2014) make the distinction between skills for individual argumentative writing and the dialogic argumentation between individuals "personified by a flesh-and-blood other." Face-to-face argumentative discourse approaches have been shown to be a productive means for developing argumentative competences in school-aged students because of the close connection between argument as product and process, as well as because of the developmental origins in everyday talk. In digital environments, particularly when communication is carried out asynchronously (e.g. blogs, discussion boards), dialogic argumentation is arguably more of a challenge due to the absence of person-to-person cues and the need to imagine a "missing interlocutor" (Graff, 2003). Therefore, online contexts can be said to require new sorts of skills in argumentation that cultivate dialogic focus, particularly as the Internet has become a defining technology for literacy.

Online reading and argumentation
Critical "reading" of online information requires engaging in complex ways with multiple texts (Jesson, McNaughton, Rosedale, Zhu, & Cockle, 2018; Wilson & Jesson, 2019). This need is recognized in various curricula and national standards (ACARA, n.d.; NGA, 2010). Researchers of online interactions with text (including multimodality and hypertext) suggest an expanded understanding of what counts as strategic reading: there are no "counterparts in traditional reading" to the new needs for resolving meta-representations of multiple texts, including evaluations of credibility, trustworthiness, selective bias, and reasonableness (Afflerbach & Cho, 2010, p. 209). A recent factor analytic study of online comprehension using a multi-text set of divergent perspectives underlines this. Factors included two different types of evaluation to judge credibility and two different types of synthesis for single and multiple online texts (Kiili et al., 2018).
Despite this increasing need for criticality in literacy and reasoning, of being able to judge such things as the trustworthiness and reasonableness of one's own texts as well as texts of others, students in both the US (McGrew, Ortega, Breakstone, & Wineburg, 2017) and the UK (National Literacy Trust, 2018) have been found to have low levels of critical literacy or civic online reasoning skills. The limited data available indicate a similar picture in New Zealand (Jesson, McNaughton, Rosedale, Zhu, & Cockle, 2018). Directional motivated reasoning, in which confirmation and disconfirmation biases filter what is considered relevant, is common, and the evidence suggests specific courses in critical literacy are needed (Kahne & Bowyer, 2018).

Teaching skills of dialogic argumentation
Much is known about teaching argumentation from science-based fields (Song, Deane, Graf, & van Rijn, 2013). Despite this, the absence of argumentation in classrooms is well-documented (Norris, Phillips, Smith, Guilbert, Stange, Baker, & Weber, 2008; Osborne, 2010). Critical features of activities affording student argumentation in science classrooms include: contrasting intuitive or outdated models with new ideas, routine opportunities to justify and challenge ideas, and engagement in the cognitive processes of comparing and contrasting as tested by rebuttal and counter-argument (Osborne, 2010). That research highlights the need for more direct approaches to developing argumentation, rather than relying on content learning (Rapanta, Garcia-Mila, & Gilabert, 2013). More than this, the emerging view is of the central role of the discourse of argumentation (Kuhn et al., 2013).
Dialogue-intensive pedagogy has been found to contribute to valued student outcomes such as comprehension, perhaps better than other instructional designs (Wilkinson & Son, 2010; Reznitskaya et al., 2001). In one form, collaborative reasoning focuses on learning in the process of dialogic argumentation and development of argument schema. The dialogic focus can be said to go beyond adversarial and coalescent forms of argumentation because positions are modified as an outcome of the dialogue.
Instructional conditions and dialogic processes have been studied in face-to-face classroom contexts, but parallels exist in digital contexts. For example, Saltarelli and Roseth (2014) have shown that cooperation can be enhanced in a digital version of "constructive controversy," a cooperative learning procedure involving argumentation aimed at reaching and raising awareness of an integrated position. Kuhn and Crowell (2011) describe dialogic argumentation with middle schoolers using online instant messaging, reporting improved direct counterargument or persuasion over three years. One benefit of instant message applications was the ability to engage reflectively with transcripts of the written exchanges. A possible problem was that critical discourse features of persuasive argument such as undermining an opponent's position are "un-dialogic" in the sense that persuasion potentially socializes inflexible thinking and may inadvertently discourage perspective taking (see Kuhn, Zillmer, Crowell, & Zavala, 2013, p. 458).
Overall, it appears that dialogic approaches can be promoted online.
We have reported on patterns of argumentation and classroom instruction in schools with 1:1 devices serving Māori (indigenous) and Pasifika (originating in the Pacific Islands) students from low socioeconomic status communities (Rosedale, McNaughton, Jesson, Zhu, & Oldehaver, 2019). We used a digital platform based AT and observed classroom instructional foci. Three patterns emerged. The elementary school students mostly used a single perspective (their own) when arguing; low rates of either a dual perspective (including the critique of other perspectives) or an integrated perspective (including the positive appraisal of other perspectives and critique of one's own) were associated with low rates of instructional foci on argumentation in these digital classrooms. However, relationships were found between argumentation-focused instruction and the development of an integrated perspective in students' written posts of their arguments.
Teachers in that study only used the tool once, which leaves open the question of the effects of repeated use of tools to enhance argumentation. Also, analyses relating to the issue of criticality and perspective taking were not undertaken in classrooms in which there was a higher instructional focus on argumentation. The issue of whether criticality could be enhanced through an increased classroom focus and with changes in patterns of argumentation was not examined.
In this study, our partners were a new sample of teachers who used the AT on repeated occasions and co-designed instruction using the tool rather than implement an existing program. Such research practice partnerships (Snow, 2015) are particularly appropriate for solving instructional challenges "on the ground" and where embedding practices and sustaining them are longer term objectives. As noted earlier, there are gaps in our knowledge about the features of instruction needed to promote higher level (integrated) forms of argumentation, especially in students' criticality. Although previous studies have established effects on perspective integration in student writing such as counter-argument and rebuttal, we add here analyses of links between teacher aspects of instruction, the multi-perspective affordances of an online tool, and student critical perspective taking.
Three research questions were addressed:
(1) What developmentally significant changes in argumentation profiles were found to be associated with the co-design involving repeated use of the argumentation tool (AT)?
(2) What changes in instructional foci were associated with the co-design and repeated use?
(3) Were there changes in student criticality in argumentation and were these associated with changes in instructional focus?
Based on previous studies (Rosedale et al., 2019), we predicted that teachers' involvement in the partnership process plus the repeated use of the AT would be associated with changes in the instructional focus of teachers and in the argumentation profiles of students. We predicted that it would be difficult for students to develop criticality in their argumentation given Kuhn, Hemberger and Khait's (2014) finding that the higher order argumentation functions are a rare phenomenon, considered developmentally challenging even for adults. The exact level of difficulty was unclear in the context of the instructional co-design, repeated use of the AT, and the associated instructional foci.

Methods
This study was part of a wider Developing in Digital Worlds project investigating the development of 21st century skills in digital classrooms in New Zealand. We designed and tested an argumentation tool (AT) as both an instructional resource and assessment instrument. The four-year project was undertaken in clusters of elementary and upper elementary schools (n=16) primarily serving Māori (indigenous) and Pasifika (from Pacific Islands) families from low socioeconomic status communities.
The schools are supported by an educational trust that enables students to have 1:1 digital devices and in which the affordances of a shared digital pedagogy are key components of a school improvement initiative. They have been engaged in a design-based research partnership over several years. Major improvements have occurred in student achievement outcomes, notably writing, but also reading and mathematics, associated with specific properties of digital pedagogy (Jesson, McNaughton, Wilson, Zhu, & Cockle, 2018).
Argumentation was included as one of a set of cognitive and social skills considered critical for future student success. Early in the project, it was identified as the skill with the least frequent uptake as revealed in classroom observations. The AT required students to read multiple texts with different perspectives online and to post written arguments to a discussion board. The posts were used to evaluate dialogic focus using an established taxonomy and to enable links with classroom instruction.
As the schools use Google Suite applications for everyday learning in classrooms, it was important that the AT leverage current expertise and build on established tool affordances and pedagogical approaches in the schools. The longstanding research partnership provided a mechanism for the AT to be augmented by a co-design process with teachers and leaders in schools. A specific instructional design process occurred over 18 months.

Participants
The volunteer teachers and students were members of three clusters of predominantly low-decile schools (1-3) in the North Island of New Zealand. The Ministry of Education classifies the lowest deciles as schools with the highest proportion of students from low socioeconomic communities. In this article, we confine our attention to students from classrooms (n=24) whose teachers (n=12) used the AT twice and who had been part of the instructional design process in their schools. Seven teachers were lower elementary teachers (grades 3 and 4) and five taught upper elementary (grades 5-8). The students in their classrooms were different at the two time points.
To answer questions one and three below, written discussion board posts (n=253) were collected from 253 students in 12 classrooms in which two alternate forms of the AT were used (n=105 at Time 1 and n=148 at Time 2). The students ranged in age from 7 years old (3rd grade) to 13 years old (8th grade). Age-appropriate versions of the AT were used for lower elementary and upper elementary. Fifty-four percent of the students were girls. More than half of the students were Pasifika (58%), 26% were Māori, and the remaining 16% were of other ethnic groups including European and Asian.
At both time points, teachers and their classrooms were sampled for observation as part of the wider project, and these observations provided the data to answer question two below. The sampling resulted in 6 of the 12 teachers involved in the co-design being observed at both time points and the observations from these classrooms were the basis for judgements about changes in instruction.

The Argumentation Tool (AT)
The AT was developed as a hyperlinked design with three Google applications to sequence the argumentation activity phases: Phase 1 - Provocation (Google Slides) introduced a contentious claim with hyperlinks to subsequent task phases; Phase 2 - Evidence sheet (Google Docs) provided short excerpts representing conflicting viewpoints in support of or against the provocation from different media sources; Phase 3 - Discussion board (Google Groups) offered a means of online written response to the provocation.
Students were asked to adopt an independent but dialogic focus in response to the provocation about an environmental issue in New Zealand. A dialogic focus was encouraged in three ways: (a) the discussion board topic represented the claim of a would-be discussant (e.g. environmentalist); (b) the evidence sheet comprised conflicting, relevant viewpoints and evidence; (c) the students were instructed to "make sure you think about other people's opinions when you reply." This first version of the AT was considered highly topical for young people at the time, featuring an event involving a visiting celebrity and a local production crew filming a music video. The beach location includes a protected area that is home to native dotterels, an endangered bird species, and is protected by the New Zealand Department of Conservation (DOC). Use of the beach is regulated by permit. The film company were reported to have disregarded regulations by transporting personnel (including the celebrity) and equipment in 12 vehicles instead of the mandated two.
The second version of the AT featured another environmental issue. The Government's Predator Free 2020 campaign promotes eradication of all introduced animal and insect pests (e.g. rats, possums) considered a threat to native wildlife. Pests are catalogued with illustrations on the DOC website. Extreme Predator Free groups are calling for interim curfews for felines and a long-term ban, even of pets, while noted biologists are advocating caution with regard to impacts on food chains. Guidelines given to teachers included reading through the slide instructions with the whole class, answering any questions, and being on hand to give support with reading comprehension. A copy was shared with students on the class website or via email. Teachers were also requested to share with the rest of the class any questions raised in one-on-one interactions, so that all students benefitted collectively from any advice.
After reading and reviewing the evidence sheet, students were given thirty minutes to write and post their response to the discussion board. During this time, they were unable to view any of the other responses as Google Groups provides a separate window instance during drafting, and the discussion thread is not populated until after the student clicks the "Post" option.
The evidence sheet bundled together excerpts from different media sources about the event. These provided a balance of confirmatory and conflicting evidence. The primary source could be accessed with an online search.
Students' written responses were supported by prompts including: "Start by clearly saying - I agree, disagree, or partly agree/disagree; remember that your post will be read by people around the world; make sure you think about other people's opinions when you reply; give good reasons and think logically." They were also encouraged to inquire beyond the evidence provided and to "use other sites or information on the Internet to help you." To align with the assessment purposes of the research, students were not permitted to edit their contribution once posted. Teachers were encouraged to invite students to respond to each other's responses in a subsequent lesson, once the data had been collected.

The instructional design process
The teachers and their school leaders were part of a wider research-practice partnership that employed rigorous design-based research methodology to test effectiveness and redesign pedagogy to be increasingly effective (see Jesson, McNaughton, Wilson, Zhu, & Cockle, 2018). School principals and teachers were used to participating in co-design as a form of intervention. Following the early stages of the project, educators and researchers agreed on the need to increase awareness of argumentation as a valued educational outcome. The instructional design process used teacher and student data as well as research literature to establish properties and components of argumentation and relationships with classroom instruction and instructional designs that would promote argumentation.
The process was situated within online and face-to-face professional learning communities comprising small groups of classroom teachers, middle school leaders (syndicate leaders), and program administrators in schools (four strands across all the schools) working alongside researchers.
Up to five sessions (depending on school timetables) were led by two or three researchers; one researcher (Greenleaf) participated remotely online in some sessions with one group of schools or as archived and replayed in others. Initial sessions introduced concepts of argumentation and critical reasoning and their developmental features; reviewed classroom observations data (teacher and student); introduced AT data on features of argumentation using Kuhn, Hemberger and Khait's (2014) framework; provided initial resources; used small group tasks to look at bias and how to draw students' attention to the importance of identifying or questioning disconfirming evidence; and finally introduced a task to design and trial a discussion board approach in their own contexts.
The sessions focused on different aspects of argumentation, collaborative reasoning, and pedagogy. There was a focus in these on making student thinking processes visible, socialized, and generalizable through what was described as "going meta" (using strategies for promoting self-reflection) and through specific provocations in science texts and with "fake news" inquiry examples. Across sessions, participants worked together to explore how to build skills, norms, values, and practices that would promote argumentation. Discussions of pedagogy included use of research-informed specific prompts (e.g. "my evidence is…and it can be trusted?" or "some people may say…", Reznitskaya et al., 2009).

Question 1: developmental changes in student argumentation
To answer question one, the written posts were downloaded and each post was segmented into idea units (essentially a statement that carries a single claim supported by a reason). Each idea unit was then coded into one of four categories of functions from Kuhn and Crowell's (2011) taxonomy (see Table 1). Two codes were taken as reflecting criticality: identifying weakness in one's own reasoning (M-) or in the other's reasoning (O-).
A developmental profile for each student's blog post was then created. A profile could be single (limited to only M+ idea units), dual (includes both M+ and O- idea units), or integrated (includes at least one integrated idea unit, O+ or M-; see Table 1). The students' written posts were then all coded by one coder, after which a randomly chosen 20% of the posts were coded by a second coder. The agreement percentage was 92%.
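The profile rule described above can be sketched in a few lines. This is only an illustration of the stated definitions, not the authors' coding procedure: the function name, the string labels, and the "unclassified" fallback for combinations the text does not define (for example, a post containing only O- units) are all assumptions.

```python
from collections.abc import Iterable

# Idea-unit codes as used in the text:
# M+ supports one's own position, O- critiques the other position,
# O+ credits the other position, M- concedes a weakness in one's own.
VALID_CODES = {"M+", "O-", "O+", "M-"}
INTEGRATED_CODES = {"O+", "M-"}


def argumentation_profile(codes: Iterable[str]) -> str:
    """Return the profile of one post from the codes of its idea units."""
    present = set(codes)
    unknown = present - VALID_CODES
    if unknown:
        raise ValueError(f"unknown idea-unit codes: {sorted(unknown)}")
    # Integrated: at least one O+ or M- idea unit.
    if present & INTEGRATED_CODES:
        return "integrated"
    # Dual: both M+ and O- idea units, and nothing integrated.
    if present == {"M+", "O-"}:
        return "dual"
    # Single: limited to M+ idea units only.
    if present == {"M+"}:
        return "single"
    # Assumption: combinations the text does not define are flagged.
    return "unclassified"


print(argumentation_profile(["M+", "M+", "O-"]))
```

Because the integrated check comes first, a single O+ or M- idea unit is enough to classify a whole post as integrated, which matches the "at least one" wording of the definition.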
There are two base units in subsequent descriptive and statistical analysis: by idea unit or by post. Frequencies and percentages of argumentation functions were calculated by idea units and frequencies and percentages of argumentation profiles were aggregated by post.

Question 2: changes in instructional focus
To answer question two, classrooms were observed before and after data collection using the AT. Observations were divided into repeated three-minute intervals, structured to alternate between close observation of teachers and observations of students working independently. Each classroom observation totaled 48 minutes (12 intervals). Within the observation schedule, each interval was coded for the presence of argumentation sub-skills such as a claim, warrant, evidence, and conclusion. The observations were carried out as part of the wider project focused on several social and cognitive skills. Because of this, we could also check observers' records for intervals coded as showing a focus on critical thinking and the presence of the sub-skills of critical thinking, defined as instances in which students or teachers justified, evaluated, reflected on, or critiqued thinking.
After each three-minute interval, the observer recorded field notes for one minute, so that a teacher-focus interval (three minutes) and a student-focus interval (three minutes), each followed by notes, made up an eight-minute block. We analyzed 72 intervals at Time 1 and Time 2. Inter-rater reliability using two observers was above 90% at each time point.
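The timing figures reported for the observation schedule can be checked against one another with simple arithmetic. The sketch below assumes that each block consists of a teacher-focus and a student-focus interval, each followed by one minute of notes, and that the 72 intervals are counted per time point across the six teachers observed at both times; the variable names are illustrative.

```python
# Figures taken from the text.
INTERVAL_MINUTES = 3        # length of one coded interval
FIELD_NOTE_MINUTES = 1      # notes written after each interval
OBSERVATION_MINUTES = 48    # length of one classroom observation
TEACHERS_OBSERVED = 6       # teachers observed at both time points

# One block = teacher interval + notes + student interval + notes.
block_minutes = 2 * (INTERVAL_MINUTES + FIELD_NOTE_MINUTES)
blocks_per_observation = OBSERVATION_MINUTES // block_minutes
intervals_per_observation = 2 * blocks_per_observation

# Assumption: the 72 analyzed intervals are per time point.
intervals_per_time_point = TEACHERS_OBSERVED * intervals_per_observation

print(block_minutes, intervals_per_observation, intervals_per_time_point)
```

Under these assumptions the eight-minute block, the 12 intervals per observation, and the 72 intervals are mutually consistent.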
Intervals of classroom observations for the same six teachers at Time 1 and Time 2 were coded for a focus on argumentation sub-skills such as a claim, warrant, evidence, and conclusion. These codes were used to identify classrooms in which the teacher demonstrated a relatively high explicit focus on the sub-skills of argumentation (a focus on any one sub-skill was explicitly observed in at least one interval) or a low focus on argumentation sub-skills (no explicit focus was observed).
The observations also provided data for judging the instructional focus on criticality through the coding of intervals for critical thinking. The subcategories used in the observation schedule included one in which the teacher or a student provided a critique of a claim or position.

Question 3: changes in criticality
The coding of functions in the blog posts also provided data on criticality in students' argumentation. Criticality is a component of argumentation and two functions were identified that entailed some form of being critical: O- (critiques the other's alternate position) and M- (acknowledges weakness in one's own position). See definitions and examples in Table 1.

Results
Question 1: changes in student argumentation
We analyzed 740 idea units from 253 posts (287 at Time 1; 453 at Time 2). Idea units were classified into their argumentation functions: single focus, dual focus, or integrated focus. These functions were used to obtain a profile of the overall student blog posts (see Tables 2 and 3).

As we have previously found in a different sample from these schools (Rosedale et al., 2019), the developmentally early (single) focus predominated, both in terms of idea units and in terms of individual profiles from the overall blog post. However, both single focus and dual focus idea units decreased between Time 1 and Time 2 (by 5% and 2% respectively) and more advanced idea units with an integrated function increased by 7%. A chi-squared test of independence indicated that these changes in the distributions of idea units were statistically significant (chi-squared test statistic = 6.51, df=2, p-value < .05).

Question 2: changes in instructional focus
Overall changes
Tables 4 and 5 present the observation data from the same six teachers' classrooms at Time 1 and Time 2. At Time 1, 15% of the intervals were coded as having a teacher or student focus on argumentation, and these occurred in only two of the six teachers' classrooms (these were subsequently identified as Hi focus classrooms). The observed intervals with an argumentation focus more than doubled at Time 2 to 42% of the intervals, and these were recorded in four teachers' classrooms (identified as Hi focus classrooms). At Time 1, the component of argumentation that was most frequent was making claims, often in the form of a teacher question (e.g. "Using the Transport Agency percentages about the safety of the different models, what claims could Mazda make to their customers?"). At Time 2, more intervals of making claims were observed, together with an increase in providing evidence, again from the teacher as questions (e.g. "What evidence do the park rangers give that actually backs up their claim of climate change?"). Providing a warrant (e.g. "What would we need to argue to show Mojo's explanation to the judge in the story supports his mother's claim of innocence?") occurred with moderate frequency at both times, but concluding (e.g. "From this data on children's TV programs, what possible conclusions could you draw?") was consistently infrequent.
Hi and Lo Focus Classrooms
The percentages of integrated idea units from students in classrooms designated either as Hi or Lo focus at Time 1 and Time 2 are shown in Figure 1. Table 6 summarizes chi-squared test results. The percentage of integrated idea units improved for both Hi and Lo focus classrooms: in Lo focus classrooms, from 19% to 24% at Time 2; in Hi focus classrooms, from 18% to 40%.
At Time 1, the Hi and Lo teaching focus was not associated with different percentages of posts with an integrated profile (chi-squared test statistic = 0.004, df=1, p-value > .05). However, at Time 2 the students from Hi focus classrooms on average had 2.11 times the odds of forming an integrated perspective compared with students from the Lo focus classrooms (chi-squared test statistic = 4.06, df=1, p-value < .05, with a confidence interval of [1.01, 4.37]). The estimated odds ratio at Time 2 was similar to that reported in an earlier study by Rosedale et al. (2019).
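For readers less familiar with these statistics, the following sketch shows how an odds ratio and a chi-squared statistic are obtained from a 2x2 table of posts (Hi or Lo focus by integrated or not integrated). The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the study's data, and the sketch uses the uncorrected Pearson statistic, which may differ from the procedure the authors used.

```python
def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """Odds ratio for the 2x2 table [[a, b], [c, d]].

    a, b: integrated / not integrated posts in Hi focus classrooms
    c, d: integrated / not integrated posts in Lo focus classrooms
    """
    return (a * d) / (b * c)


def chi_squared_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-squared statistic (df = 1), no continuity correction."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Hypothetical counts for illustration only (NOT the study's data).
hi_integrated, hi_other = 20, 10
lo_integrated, lo_other = 10, 20

print(odds_ratio(hi_integrated, hi_other, lo_integrated, lo_other))
print(chi_squared_2x2(hi_integrated, hi_other, lo_integrated, lo_other))
```

An odds ratio above 1 means the odds of an integrated profile are higher in the first row (here, Hi focus) than in the second; a confidence interval that excludes 1, as in the Time 2 result, points the same way as a significant chi-squared test.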

Question 3: changes in criticality
The developmental changes described above in idea units with an integrated focus and in integrated profiles were exclusively the result of increases in the proportion of idea units that recognized the worth of the alternative position (O+), rather than any form of being critical. There were only eight idea units at Time 1 and three at Time 2 that had the function of criticizing one's own position (M-). The other type of idea unit that reflected criticality was critiquing the other's position (O-, coded as contributing to a dual focus). O- idea units occurred slightly more frequently than M- idea units at both time points, but there was no increase over time: at Time 1, O- idea units made up 9.8% of idea units (n=28), and at Time 2, 8.2% (n=37).
The observations recorded intervals of critical thinking and also coded sub-categories (e.g. justify, evaluate/reflect, credibility, critique, other). At Time 1, the total number of intervals in which some form of critical thinking was recorded was similar to argumentation (n=14). This total number changed minimally from Time 1 to Time 2 (n=15), although the component most often observed did vary, between making evaluative comments (Time 1) and providing justifications (Time 2). What is noticeable is the low number of intervals coded for offering a critique (e.g. "Can anyone explain to us whether Cara's work needs improving or not?"); these classrooms had no instances of providing credible reasons (e.g. "How is it by retesting, we have increased the credibility of our results?").

Discussion
This study sought to answer three questions: (1) What developmentally significant changes in argumentation profiles were found to be associated with the co-design involving repeated use of the argumentation tool (AT)? (2) What changes in instructional foci were associated with the co-design and repeated use? (3) Were there changes in student criticality in argumentation and were these associated with changes in instructional focus?
Developmental changes in students' online argumentation skills
This study adds to others showing that instruction and tools can be incorporated in classrooms that promote advanced forms of argumentation, despite the evidence that the advanced integrated forms are a rare phenomenon and developmentally challenging, even for adults (Kuhn, Hemberger, & Khait, 2014). This study extended the evidence base to include classrooms in ubiquitous digital environments using a digital tool as the platform for argumentation. We found evidence that developmentally significant changes in argumentation of students aged between 7 years and 13 years were associated with the co-design process.
It is assumed that a stage-like developmental progression in the quality of argumentation occurs from a single to dual profile as a basis for integrated functions (e.g. Kuhn, Hemberger, & Khait, 2014). In previous descriptions from these schools, growth curves did not indicate a clear stage-like developmental progression from single to dual to integrated argumentation (Rosedale et al., 2019). In the current study, data from two cohorts of elementary school students across a six-year age range also did not show a clear stage-like pattern. The pattern at Time 2 was more bimodal, with little change in percentages of dual function idea units or a dual profile. In this New Zealand context, patterns of responding seemed able to shift from single to integrated within classrooms. However, a limitation is that these are aggregated data and not individual developmental data. It would be necessary to examine longitudinal patterns of individual students in this digital context to confirm the nature of possible developmental progressions.
Despite the developmental change towards integrated functions of argumentation, the shift clearly was difficult for many students. Fewer than half of the students had integrated profiles at Time 2. Also, there was still a predominance of holding one's own position or a single stance reflecting what others found (Kuhn & Crowell, 2011).
Why are the integrated functions so difficult? Kuhn and Crowell (2011) maintain that integration is mainly of two forms: acknowledging the benefits of another perspective or challenging the flaws or weaknesses in one's own perspective. Each places considerable demand on cognitive capabilities, especially the maturity of self-reflection, and being able to overcome the pervasiveness of both negative and positive confirmation bias in motivated reasoning (Kahne & Bowyer, 2018). These are challenges for younger children, and even young adolescents have been found to concentrate attention on the exposition of their own claims, ignoring alternative perspectives and disconfirming evidence (Kuhn, 2001).
In addition to cognitive load and motivated reasoning explanations, there is also a curriculum and pedagogical explanation. Curricula can be considered as channels for socializing skills and knowledge through enacted practices (McNaughton, 1994). The New Zealand curriculum identifies persuasion ("writing text to influence others") as a key focus of progress in writing. Since 2003, persuasion has been one of six, and more recently only five, writing genres assessed by elementary teachers with the standard tool used in elementary schools. In addition, our evidence from this and other studies is that argumentation is both misunderstood (often as persuasion, debate, or forming consensus) and seldom focused on in classrooms; together with curriculum and assessment, this indicates practices that are in conflict with argumentation practices as conceived by Kuhn et al. (2014).

Instructional focus and co-design
The developmentally significant changes were associated with a collaborative instructional design process involving both researchers and teachers. That process was relatively light in terms of face-to-face or digital sessions with leaders and teachers. But it used features known to be effective in school change; notably, it was based in the teachers' own practices and used both student and teacher data to establish an educationally significant problem for the teachers to solve in collaboration with the researchers (Brown & Poortman, 2018).
Even given only one lesson sample of classroom instruction, from just six teachers at two time points, there were indications that the instructional focus did change significantly and that change was associated with changes in students' argumentation functions, consistent with the instructional design. The evidence from the Hi and Lo classroom analysis suggests that impacts are likely to be greater as more instructional time is devoted and when more components of argumentation are focused on.
The AT, used twice, was appreciated by the teachers as something they could appropriate for further design. Teachers reported they would continue to use the framework and populate it with new "provocations" and new bundled online resources to promote argumentation. Given that commentaries on digital tools are often focused on affordances for students, we argue that it is important also to consider the role of affordances for teachers.

Criticality
Criticality in argumentation proved especially difficult to enhance. While being critical of the other's position occurred relatively frequently, the proportion did not change over time. Even more noticeable were the low levels of self-critique at Time 1 and the lack of any change. These low levels and lack of change were found despite the overall changed instructional foci and the repeated uses of the tool by the teachers. Why was being able to identify weaknesses or flaws in one's own reasoning even more difficult than being able to acknowledge the worth of the other's position?
One reason was that, despite an increased focus on argumentation across classrooms, seen especially in the components of making claims and providing evidence, an instructional focus on criticality was observed infrequently, if at all. In addition, wider schooling norms constraining critical forms of student engagement may also be at play. Low levels of critical literacy or civic online reasoning skills have been reported in the United Kingdom and the United States (McGrew, Ortega, Breakstone, & Wineburg, 2017; National Literacy Trust, 2018), and the indications are that this is likely true in New Zealand as well (Jesson, McNaughton, Rosedale, Zhu, & Cockle, 2018).
It is widely accepted that argumentation competence requires developing shared sets of values, standards, and discourse features that shape behaviors over time (Kuhn, Wang, & Li, 2010). Developing students' commitment to what Michaels, O'Connor and Resnick (2008) term "accountable talk" requires shared standards of knowing, valuing critique, and critical perspective taking as essential components of knowledge building (whether online or face-to-face). These norms need to be socialized within a community of learners as "ground rules" (Mercer, 1996, p. 363) for online discourse.

Limitations
A limitation of the present study is the weak nature of the intervention design. Further research with the tool should more systematically test effects using robust quasi-experimental methods. This study does not demonstrate the scalability of the tool and the co-design process. Our working hypothesis for further studies is that developing criticality of argumentation will take quite specific, deliberate, and broad promotion within dialogic activities (Osborne, 2010) embedded in classroom communities that have shared values and norms, as well as the necessary discourse expertise. In turn, teachers will need particular forms of "adaptive expertise" to construct communities with these features. This is not a quick or easy task. A number of interventions have shown that intensive and extended multi-component professional development is needed to develop teaching for argumentation (Hennessy, Mercer, & Warwick, 2011; Wilkinson et al., 2017).