Institutionen För Systemteknik Department Of Electrical Engineering Sata Controller Using Virtex-5

Institutionen för systemteknik
Department of Electrical Engineering

Examensarbete (Master's thesis)

Bridging of SCSI to SATA and Implementation of a SATA Controller using Virtex-5

Master's thesis in Computer Engineering carried out at the Institute of Technology, Linköping University (Tekniska högskolan i Linköping)
by Erik Landström

LITH-ISY-EX--09/4228--SE
Linköping 2009

Supervisor: Dr. Daniel Wiklund, Sectra Communications AB
Examiner: Lecturer Olle Seger, ISY, Linköpings universitet
Linköping, 20 February 2009

Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden

Copyright

The publishers will keep this document online on the Internet (or its possible replacement) for a period of 25 years from the date of publication, barring exceptional circumstances. The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for his/her own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility. According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement. For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its home page: http://www.ep.liu.se/

© Erik Landström

Report information

Division, Department: Division of Computer Engineering, Department of Electrical Engineering, Linköpings universitet, SE-581 83 Linköping, Sweden
Date: 2009-02-20
ISRN: LITH-ISY-EX--09/4228--SE
URL for electronic version: http://www.da.isy.liu.se and http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-16784
Title (Swedish): Bryggning mellan SCSI och SATA samt implementering av en styrenhet för SATA på en Virtex-5
Title: Bridging of SCSI to SATA and Implementation of a SATA Controller using Virtex-5
Author: Erik Landström
Keywords: SATA, SCSI, FPGA, bridging, protocol

Abstract

Companies and authorities today often handle large amounts of data, frequently with restricted content that should be kept secret from outsiders. One way to accomplish this is to encrypt stored data in real time. A hardware solution is ideal for this, since it can be independent, fast enough, and easily added to existing systems. This report is a starting point towards that goal, focusing on two of the most common mass storage standards, SATA and SCSI. It is based on the task of developing an FPGA-based SATA controller and investigating the possibility of "speaking" SCSI with SATA devices.

The work has involved theoretical studies, system design, test-driven development using simulations and hardware tests, and technical investigation. The thesis resulted in a SCSI-to-SATA translation investigation that points out difficulties and presents a translation model. A SATA host that can execute a number of SATA commands was also implemented in VHDL on a Virtex-5 FPGA, and it has been tested against a range of different SATA devices. Simulations show that the total latency in the SATA host reaches 1 µs per 32 bits, which should not be much of a problem for most applications in a possible bridge solution.
Acknowledgments

First I'd like to thank my co-workers at Sectra Communications for all their help and advice. Special thanks to:

Dr. Daniel Wiklund, my supervisor, for all help and support during the thesis.
Dr. Michael Bertilsson for the opportunity to carry out the thesis.
Fredrik Johansson for technical support, proof-reading and help with the report's layout.

Thanks to Andrew Baldman, Senior Technical Staff at UNH InterOperability Lab, for guidance on SATA-protocol issues. Also thanks to friends and family supporting me no matter what stupidity I'm up to. Finally, thanks to Nina, my source of inspiration and life companion, love you!

Erik Landström, Linköping, 2009

Contents

1 Introduction
  1.1 Background
  1.2 The Task
  1.3 Reading Recommendations
  1.4 Content Briefing
  1.5 Definitions and Abbreviations
  1.6 Requirements
2 Feasibility Study
  2.1 SATA
    2.1.1 Dword - Data Representation
    2.1.2 Primitives
    2.1.3 8b/10b - Encoding
    2.1.4 Out of Band Signaling
    2.1.5 Physical Layer
    2.1.6 Link Layer
    2.1.7 Transport Layer
    2.1.8 Application Layer
  2.2 SCSI
    2.2.1 Command Descriptor Block
    2.2.2 Status, Sense and Error Reporting
    2.2.3 Primary Command Set
  2.3 Virtex-5
3 SATA Design and Implementation on Virtex-5
  3.1 System Overview
  3.2 Physical Layer Design
  3.3 Link Layer Design
    3.3.1 Link Layer, Idle State Machine
    3.3.2 CRC, Scrambling, and Primitive Suppression
    3.3.3 Link Layer Transceiver Block
    3.3.4 Link Layer Receiver Block
  3.4 Transport Layer Design
    3.4.1 Top Level and Idle State Machine
    3.4.2 Transport Layer, Shadow Register Interface
    3.4.3 Transport Layer, DMA Interface
  3.5 Application Layer Design
    3.5.1 Application Layer, Top Level
    3.5.2 The Command Layer
    3.5.3 Shadow Registers' Design
  3.6 Simulations
  3.7 Hardware Tests
    3.7.1 HW-test Part I, Start-up
    3.7.2 HW-test Part II, Write and Read
  3.8 SATA to SATA Bridging
4 Translating SCSI- into SATA-Commands
  4.1 Command Translation Model
    4.1.1 Support for Queued Commands
    4.1.2 Translating LBA
    4.1.3 Translating SCSI Control Byte
    4.1.4 Translation of the Primary Command Set
  4.2 SCSI / Serial ATA Translation Layer - SSATL
    4.2.1 Design Considerations
5 Results and Discussion
  5.1 SATA-host Result
    5.1.1 Simulations
    5.1.2 HW-Tests
  5.2 SSATL - Discussion
6 Future Work
  6.1 SATA Host
  6.2 SSATL
  6.3 SCSI to SATA Bridge
7 Conclusions
Bibliography
A SATA Primitives
B Mandatory SATA Commands
C Test Equipments
D Final SATA Host Design
E Synthesis Report

List of Figures

2.1 The layers of the SATA protocol
2.2 Relation between bytes, words, and Dwords
2.3 The different types of OOB-signals
2.4 SATA startup sequence
2.5 FIS+CRC encapsulated between the SOF_P and the EOF_P
2.6 Data flow for the SATA registers
2.7 SCSI overview
2.8 SCSI architecture
2.9 Execution of a SCSI write command
3.1 SATA-host system overview
3.2 Reference design clocking scheme for Phy
3.3 Reference design for the link layer
3.4 Link layer - Idle FSM
3.5 Reference design for the transport layer
3.6 Reference design for the application layer
3.7 The simulation system
3.8 Screen dump of successful OOB simulation
3.9 Screen dump of a simulation report
3.10 Data FIS reception from device
4.1 System overview using SSATL
4.2 SSATL top level flow graph
6.1 Virtex-5 in a SATA RAID or Port multiplier solution
C.1 The ml505 Evaluation Platform
D.1 Detailed SATA-host overview

List of Tables

1.1 Content briefing
1.2 Definitions
1.3 Common abbreviations
2.1 SATA-specific registers
2.2 CDB - General structure
2.3 SCSI - Primary Command Set
3.1 Clock-domains in the phy-layer
3.2 Test FSM: state description
3.3 Disks used during HW-testing
3.4 HDD initialization signature
4.1 SCSI - control byte
4.2 Primary Command Set translation summary
4.3 Inquiry request
4.4 Identify device FIS
4.5 Standard Inquiry response
4.6 Test Unit Ready request
4.7 START/STOP Unit request
5.1 Definitions for simulation results
5.2 DMA - simulation summary
5.3 Shadow registers - simulation summary
5.4 Results of the HW-tests, part I
5.5 Results of the HW-tests, part II
A.1 Description of primitives
A.2 Primitive encoding
B.1 Mandatory SATA Commands
E.1 Definitions for synthesis report

Chapter 1

Introduction

This report is the result of a Master of Science thesis in Applied Physics and Electrical Engineering carried out at Linköpings universitet. The thesis was carried out at the Department of Electrical Engineering on behalf of Sectra Communications AB.

1.1 Background

When humans interact, understanding each other's language is essential for comprehension and for avoiding misunderstandings. Similar needs arise when different technical solutions interact, but there it is usually a matter of understanding and mapping different standardized protocols. Without a correct translation, communication between systems using different protocols is not possible.

Two commonly used protocols for communicating with storage devices are Small Computer Systems Interface (SCSI) and Serial ATA (SATA). SCSI has been used especially by companies with high requirements on reliability and speed.
SATA, on the other hand, is the most common standard in personal computers today. Compared with Parallel ATA (its predecessor), SATA has improved in reliability and speed, yet it is still much cheaper than SCSI solutions.

The basis of this thesis is therefore to "speak" SCSI with SATA devices. Implementing the "interpreter" in a hardware bridge gives Sectra the opportunity to add some kind of cryptographic solution within the bridge later on. Placing the encryption inside a bridge makes it fast and independent of surrounding systems. This enables secure storage of data on a high-capacity, low-cost SATA disk that can be added to a new or already existing SCSI bus.

1.2 The Task

The task for this thesis was separated into two major parts, focusing on the SATA and SCSI protocols respectively.

The first part was to implement a SATA host on an FPGA that can communicate with different types of SATA storage devices. In addition, a short investigation of a SATA to SATA bridge on a Virtex-5 FPGA had to be performed, covering the kinds of problems that can arise, such as time-outs, and how to solve them.

The second part was to investigate the theoretical possibilities of "speaking" SCSI commands to a SATA device. This means translating and emulating commands and behaviour so that a SATA device can be used on a SCSI bus, and identifying the problems that may appear in an FPGA-based bridge solution.

1.3 Reading Recommendations

Books that I can recommend for a basic and good overview of the SATA and SCSI protocols are SATA Storage Technology [2] and The SCSI Bus and IDE Interface [10]. The SCSI book is from 1997 and should not be used for implementation, but it is well written and gives a good overview of the huge and rather complex SCSI protocol. It also provides an overview of IDE (the PATA standard), which can be interesting and helpful when working with SATA.
Both books should be complemented by the standard documents when it comes to more precise details.

1.4 Content Briefing

Table 1.1. Content briefing

Chapter 1 - Introduction
    Background, a short introduction, and the stated requirements for the thesis.
Chapter 2 - Feasibility Study
    A brief description of SATA, SCSI and the Virtex-5 FPGA. This chapter is a result of literature studies during the work.
Chapter 3 - SATA Design and Implementation on Virtex-5
    How to design a SATA host and how to implement the design on an ml505 Evaluation Platform. The chapter also discusses Virtex-5 as a bridge between a SATA host and a SATA device and the difficulties this might cause for the SATA protocol.
Chapter 4 - Translating SCSI- into SATA-Commands
    An investigation targeting usage of a SATA device on a SCSI bus. A major part is about how to translate SCSI into commands defined in the SATA standard.
Chapter 5 - Results and Discussion
    Presentation of the results and discussions concerning them.
Chapter 6 - Future Work
    Clarifies what improvements can be made and how to follow up this thesis.
Chapter 7 - Conclusions
    General conclusions regarding the thesis.

1.5 Definitions and Abbreviations

Table 1.2. Definitions

Definition      Description
→               means "assign"
⇒               means "lead to"
0xNUMBER        The NUMBER after 0x is in hexadecimal form
8b/10b          Coding standard
bold            The bold word usually corresponds to a SCSI or SATA command
Character       Equal to a byte
COMINIT         Valid OOB signal
COMRESET        Valid OOB signal
Device          Typically a storage device
Dword           32 bits of data
Frame           Data packet consisting of SOF_P + FIS + CRC + EOF_P
Generation 1    SATA supporting speeds up to 1.5 Gbps
Generation 2    SATA supporting speeds up to 3.0 Gbps
Hot-plug        Feature of adding/removing devices during operation
Host            Unit that executes commands
ml505           Xilinx Evaluation Platform using Virtex-5
Payload         Content of a data FIS, without header
rx+             One of two differential inputs to a GTP
rx-             The other differential input to a GTP
verbatim        The verbatim text is a code example or terminal output during synthesis
T10             Technical Committee on SCSI Storage Interfaces
tx+             One of two differential outputs from a GTP
tx-             The other differential output from a GTP
typewriter      The typewritten word usually corresponds to a field (or field value) in some kind of frame
Virtex-5        The FPGA used
Word            16 bits of data
WXYZ_P          WXYZ is a SATA primitive

Table 1.3. Common abbreviations

Abbreviation    Definition
ANSI            American National Standards Institute
AT              Advanced Technology
BIST            Built In Self Test
CDB             Command Descriptor Block
CRC             Cyclic Redundancy Check
DCM             Digital Clock Manager
DMA             Direct Memory Access
DUT             Device Under Test
EOF             End Of Frame
EMI             Electromagnetic Interference
FIS             Frame Information Structure
FPGA            Field Programmable Gate Array
FSM             Finite State Machine
Gbps            Gigabits per Second
GTP             Gigabit Transceiver
HDL             Hardware Description Language
HW              Hardware
IP              Intellectual Property
iSCSI           Internet SCSI
IV              Initial Value
LBA             Logical Block Address
MVL             Macro Verification Language
NACA            Normal Auto Contingency Allegiance
NCQ             Native Command Queuing
OOB             Out Of Band signaling
OSI             Open Systems Interconnection model
PATA            Parallel AT Attachment
PIO             Programmed Input/Output
Phy             Physical layer of SATA
PLL             Phase-Locked Loop
rx              Receiver
SATA            Serial AT Attachment
SCB             SCSI Command Block
SCSI            Small Computer Systems Interface
SNC             Speed Negotiation Control
SOF             Start Of Frame
SPC             SCSI Primary Command
SSATL           SCSI / Serial ATA Translation Layer
SW              Software
TCP             Transmission Control Protocol
tx              Transmitter

1.6 Requirements

The stated requirements for the thesis work are:

• All written code shall follow the standards used at Sectra.
• A bridge between two interfaces shall be transparent to surrounding systems.
• Development of a SATA controller that can perform some kind of read and write operation on different storage devices.
• The controller shall support transfer speeds of both 1.5 Gbps and 3.0 Gbps.
• The design shall be implemented to work on a Xilinx ml505 evaluation platform but still be easy to map to different hardware.
• A Virtex-5 FPGA shall be used, or kept in mind, when designing or investigating during the thesis.
• Implementation of known protocols shall, if possible, follow the rules of the standards.
• No PC or other equipment shall be involved when using a working bridge.
Chapter 2

Feasibility Study

This chapter covers the basics of SATA, SCSI, and the Virtex-5 FPGA.

2.1 SATA

Serial Advanced Technology Attachment (SATA) is a serial-link replacement for Parallel ATA (PATA); both are standards for communication with mass storage devices. The high-speed serial link is a differential link that utilizes Gigabit technology and 8b/10b encoding. Some of the features of SATA compared to PATA are increased transfer speed, hot-plug capability, and Native Command Queuing (NCQ). The link supports full duplex, but the protocol only permits frames in one direction at a time. The other, non-data direction is used for flow control of the data stream.

Figure 2.1. The layers of the SATA protocol

SATA's architecture consists of four layers (see Figure 2.1): Application, Transport, Link, and Physical. The Application layer is responsible for the ATA commands overall and for controlling SATA register accesses. This layer can also interact with the host so that the interface is presented as PATA. The Transport layer places control information and data to be transferred between the host and the corresponding SATA device in a data packet. One such packet is called a frame information structure (FIS). The Link layer is responsible for taking data from a FIS and encoding/decoding it using 8b/10b. It also inserts control characters for flow control and calculates the cyclic redundancy check (CRC) for error detection. Finally, the Phy layer's task is to deliver and receive the encoded serial data stream on the wire.

2.1.1 Dword - Data Representation

In the SATA standard the smallest allowed unit of data is a Dword. Its 32 bits are divided into four bytes, where each pair of bytes represents a word and a pair of words represents a Dword. It follows that an odd number of bytes is not allowed in SATA communication.

Figure 2.2. Relation between bytes, words, and Dwords

A Dword is either a data Dword or a so-called primitive.
A primitive is a predefined Dword, for example start of frame (SOF_P) or end of frame (EOF_P).

2.1.2 Primitives

Primitives are Dwords whose purpose is to enable and control the serial communication. Each begins with a control character followed by three other characters that fill up the Dword. The control character makes it easy to distinguish a primitive from an ordinary data Dword of a frame. There are 18 different primitives, each with a dedicated task, for example marking the start of a frame with SOF_P or providing synchronization with SYNC_P. For a full list of primitives and their Dword representation, see Appendix A.

2.1.3 8b/10b - Encoding

8b/10b encoding is common in high-speed applications. It provides bounded disparity while still producing enough toggling to make clock recovery possible (synchronizing the internal clock with the data stream). Bounded disparity means that in a string of twenty bits the difference between the number of zeros and ones shall be -2, 0, or 2, with a maximum run length of five. The drawback is an overhead of two bits per byte, which reduces the actual transfer rate of, for example, a 1.5 Gbps link to 1.2 Gbps, a loss of 20 %. Since 8b/10b extends the possible alphabet from 256 symbols to 1024, it can detect and encode special characters (also called K-characters) in an easy and effective way. The SATA standard uses this by starting every primitive with such a special character.

2.1.4 Out of Band Signaling

Since SATA devices and hosts always send junk over their differential channels when idle (otherwise the link is considered lost), there has to be a way of recognizing a signal before a link has been initialized. For this, SATA uses so-called out of band (OOB) signaling to initialize a connection between a host and a device. The OOB mechanism supports low-speed transmission over a high-speed connection such as a SATA link. The OOB signals are non-differential but are sent over a differential channel.
This is possible by letting the differential transmitters drive their output pins to the same voltage, which reduces the differential voltage; when it drops below a preset threshold, the receiver recognizes the signal as OOB.

Figure 2.3. The different types of OOB-signals

As can be seen in Figure 2.3 (figure from [3]), there are three types of valid OOB signals (in practice two, since COMINIT and COMRESET are identical), in which bursts of six ALIGN_P are sent with different timing. The significance of the signaling lies in the timing; it does not really matter whether an ALIGN_P or something else is sent, because the receiver only detects the drop in voltage difference between rx+ and rx-.

Figure 2.4 (figure from [3]) visualizes the complete startup sequence; the calibration steps in it are optional to implement. The host sends COMRESET until the device is powered on and can respond with a COMINIT. Upon reception of the COMINIT the host sends a COMWAKE to the device, which shall send a COMWAKE back. If this procedure finishes within the allowed time, the OOB signaling ends and the differential communication can proceed with determining the link speed (right part of the figure).

Figure 2.4. SATA startup sequence

2.1.5 Physical Layer

This section describes the physical interface towards the actual SATA link. Depending on the SATA usage, "host to device" or "system to system application", the electrical characteristics of the physical layer (phy) might have to be considered (see [7]).
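The host side of the start-up handshake in Figure 2.4 (Section 2.1.4) can be sketched as a small state machine. This is only an illustration of the COMRESET, COMINIT, COMWAKE order described above: the function name, the state names and the "TIMEOUT" event are hypothetical, and the burst timing and the optional calibration steps are not modeled.

```python
def host_oob_handshake(received):
    """Walk the host side of the OOB start-up handshake.

    `received` lists what the host detects in each polling interval:
    "COMINIT", "COMWAKE", "TIMEOUT" or None (nothing detected).
    Returns the final state and the OOB signals the host has sent.
    State and event names are illustrative, not taken from the standard.
    """
    state = "RESET"
    sent = ["COMRESET"]              # the host starts by sending COMRESET
    for signal in received:
        if state == "RESET":
            if signal == "COMINIT":  # device is powered on and answers
                sent.append("COMWAKE")
                state = "AWAIT_COMWAKE"
            else:                    # keep sending COMRESET until it does
                sent.append("COMRESET")
        elif state == "AWAIT_COMWAKE":
            if signal == "COMWAKE":  # device echoed COMWAKE in time
                state = "SPEED_NEGOTIATION"
                break
            if signal == "TIMEOUT":  # too slow: start over
                sent.append("COMRESET")
                state = "RESET"
    return state, sent


print(host_oob_handshake([None, "COMINIT", "COMWAKE"]))
```

Keeping the sent signals in a list makes the sketch easy to compare with a simulation trace; a hardware implementation would instead drive the transmitter directly from the state register.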
The features of the phy can be summarized as:

• Transmit/receive a 1.5 Gbps or 3.0 Gbps differential signal
• Speed negotiation
• OOB detection and transmission
• Serialize 10-bit, 20-bit, or other-width parallel data from the link layer
• Extract data from the serial data stream
• Parallelize the data stream and send it to the link layer
• If a host, provide device status to the link layer: Device present, Device absent, or Device present but failed to negotiate
• Handle spread spectrum clocking (SSC), a clock modulation technique used to reduce unintentional interference with radio signals

At startup the physical layer is in its OOB state; after a link has been initiated it changes to the Idle Bus condition, and normal SATA communication is supported. Since the SATA connection is noisy, the physical layer detects a frame when it receives a SOF_P primitive and keeps listening to the incoming signal until an EOF_P primitive is received. Apart from FISes, SATA traffic also consists of single primitives, which are all easy for the phy to recognize because of their leading control character.

The Bit Error Rate (BER) of SATA should be at most 10^-12. With 8b/10b encoding and a maximum-sized frame of 8192 bytes plus an overhead of 8 bytes, this gives a maximum Frame Error Rate (FER) of:

\[ FER_{max} = (8192 + 8) \cdot 10 \cdot 10^{-12} = 8.200 \cdot 10^{-8} \tag{2.1} \]

So if 1 billion maximum-sized frames (about 8000 GB of data) are transmitted, approximately 80 of them will be malformed.

2.1.6 Link Layer

This section describes the SATA link layer. The link layer's major tasks are:

• Flow control
• Encapsulation of FISes received from the transport layer
• CRC generation and CRC check
• FIS scrambling and de-scrambling
• 8b/10b encoding/decoding
• Power management (optional)

A FIS is framed between a SOF_P and an EOF_P, which create the boundaries of a frame. The last Dword before an EOF_P is the CRC value for the FIS (see Figure 2.5).
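The two numerical claims made about the link so far, the 20 % 8b/10b overhead in Section 2.1.3 and the frame error rate in Equation (2.1), can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the text; the constant and function names are made up for the example, and the FER is the usual small-BER approximation (number of bits on the wire times the BER).

```python
import math

BER_MAX = 1e-12              # maximum bit error rate quoted in the text
BITS_PER_ENCODED_BYTE = 10   # 8b/10b: every byte is sent as 10 bits
MAX_FRAME_BYTES = 8192       # maximum frame size used in Equation (2.1)
FRAME_OVERHEAD_BYTES = 8     # overhead used in Equation (2.1)


def effective_rate(line_rate_bps):
    """Payload rate that remains after the 8b/10b overhead."""
    return line_rate_bps * 8 / BITS_PER_ENCODED_BYTE


def max_frame_error_rate(ber=BER_MAX):
    """Equation (2.1): bits on the wire for one frame times the BER."""
    bits_on_wire = (MAX_FRAME_BYTES + FRAME_OVERHEAD_BYTES) * BITS_PER_ENCODED_BYTE
    return bits_on_wire * ber


frames = 1_000_000_000
print("payload rate of a 1.5 Gbps link:", effective_rate(1.5e9), "bit/s")
print("maximum FER:", max_frame_error_rate())
print("expected bad frames:", frames * max_frame_error_rate())
print("data volume in GB:", frames * MAX_FRAME_BYTES / 1e9)
```

The expected number of bad frames comes out slightly above 80 and the data volume slightly above 8000 GB, which is consistent with the rounded values in the text.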
The CRC is calculated by applying the 32-bit generator polynomial G(x) in Equation (2.2) to every bit of every non-primitive Dword in a FIS and then summing (modulo 2) all these terms together with the Initial Value (IV). The CRC IV is fixed to 0x52325032.

\[ G(x) = x^{32} + x^{26} + x^{23} + x^{22} + x^{16} + x^{12} + x^{11} + x^{10} + x^{8} + x^{7} + x^{5} + x^{4} + x^{2} + x + 1 \tag{2.2} \]

Scrambling a FIS reduces EMI by turning the data into a pseudorandom bit pattern, which spreads the noise over a broader frequency spectrum. The scrambling algorithm can be expressed as a polynomial (see Equation (2.3)) or as a linear feedback shift register.

\[ G(x) = x^{16} + x^{15} + x^{13} + x^{4} + 1 \tag{2.3} \]

The algorithm resets to an IV of 0xFFFF every time a SOF_P is encountered at the scrambler. The de-scrambler applies the same algorithm to the scrambled data, which restores its original form.

Figure 2.5. FIS+CRC encapsulated between the SOF_P and the EOF_P

It is important that the CRC is calculated on the original data and that scrambling/de-scrambling takes place between the CRC and the 8b/10b encoding/decoding.

The flow control between host and device is managed by sending primitives to one another that report status (which originates from the transport layer). Some of these primitives can be inserted into FISes. Primitives are not supposed to be scrambled or included in the CRC sum. Internally, the flow control is regulated by signaling between the layers.

The optional power management should, in addition to normal management, enable the partial and slumber power states. These states are intended for power-critical systems such as notebooks. Slumber is the most power-conserving state and has a maximum wakeup time of 10 ms, compared to 10 µs for partial. Power management is vendor specific, but some guidelines are given in the Serial ATA Revision [7].
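The order of operations in this section (CRC over the original data, then scrambling of FIS plus CRC) can be sketched in Python. This is a minimal sketch and not the thesis's VHDL: the polynomials and initial values come from Equations (2.2) and (2.3) and the text, but the bit ordering (most significant bit first, one Dword at a time) and the LFSR form are assumptions that may differ from the reference implementation in the standard. All function names are hypothetical.

```python
CRC_POLY = 0x04C11DB7        # Equation (2.2) without the x^32 term
CRC_SEED = 0x52325032        # CRC initial value from the text
SCRAMBLER_TAPS = 0xA011      # Equation (2.3) without the x^16 term
SCRAMBLER_SEED = 0xFFFF      # scrambler initial value, reloaded at every SOF


def crc32_dwords(dwords, crc=CRC_SEED):
    """Bitwise CRC over 32-bit Dwords, MSB first (assumed bit order)."""
    for dword in dwords:
        crc ^= dword & 0xFFFFFFFF
        for _ in range(32):
            msb = crc & 0x80000000
            crc = (crc << 1) & 0xFFFFFFFF
            if msb:
                crc ^= CRC_POLY
    return crc


def keystream(n_dwords, state=SCRAMBLER_SEED):
    """Generate n_dwords pseudorandom Dwords from a 16-bit Galois LFSR."""
    words = []
    for _ in range(n_dwords):
        word = 0
        for _ in range(32):
            bit = (state >> 15) & 1
            state = (state << 1) & 0xFFFF
            if bit:
                state ^= SCRAMBLER_TAPS
            word = (word << 1) | bit
        words.append(word)
    return words


def scramble(dwords):
    """XOR with the keystream; applying it twice restores the input."""
    return [d ^ k for d, k in zip(dwords, keystream(len(dwords)))]


def build_frame_payload(fis):
    """Transmit side: CRC on the original data, then scramble FIS + CRC."""
    return scramble(fis + [crc32_dwords(fis)])


def check_frame_payload(payload):
    """Receive side: de-scramble first, then verify the CRC."""
    plain = scramble(payload)
    # Running the CRC over data followed by its own CRC leaves zero.
    return plain[:-1], crc32_dwords(plain) == 0


fis = [0x00308027, 0xE1234567, 0x00000000, 0x00000002, 0x00000000]
data, ok = check_frame_payload(build_frame_payload(fis))
print(ok, data == fis)
```

Because no final inversion is applied in this sketch, a receiver can run the same CRC over the de-scrambled FIS and its CRC Dword and simply compare the result with zero.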
2.1.7 Transport Layer

The main task of the SATA transport layer is to handle FISes. A brief description of the layer's features follows:
• Flow control
• Error control
• Error reporting
• FIS construction
• FIS decomposition
• FIS buffering for retransmission

There are eight types of FISes, each with its specific 8-bit ID and unique header. FISes vary in size from 1 Dword up to 2049 Dwords. The number of bytes in a FIS is always a multiple of four, so the transport layer has to pad with zeros if bytes or bits are missing from a complete Dword. The flow control in this case only consists of reporting to the link layer that the data buffers are close to overflow or underflow. Detected errors are supposed to be reported to the application layer, and the detectable errors are:
• Errors from lower layers, such as 8b/10b disparity errors or CRC errors.
• SATA state or protocol errors caused by a violation of the standard.
• Frame errors, such as a malformed header.
• Internal transport layer errors, such as buffer overflow.

Errors are handled in different ways. For example, resending of complete FISes is supported for all kinds of FISes except data FISes (and the BIST FIS, which is typically used during testing), because resending data FISes would require buffers of 8192 bytes (the maximum supported FIS size). The maximum-sized non-data FIS is 28 bytes, so the cost of a large buffer can be spared.

2.1.8 Application Layer

The application layer is the programming interface of the SATA environment and consists of the shadow registers, the SATA-specific registers, and a DMA engine. It also contains the command layer, which tells the transport layer what kinds of FISes to send and receive for each specific command and in which order those FISes are expected to be delivered. One of the main considerations when the SATA standard was developed was backward compatibility with PATA, so the application layer also provides PATA emulation.
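The role of the command layer, knowing which FISes to expect and in which order, can be sketched as a lookup table. The Python sketch below is illustrative only: the names are hypothetical, and the FIS type IDs and sequences are taken from general SATA protocol knowledge rather than from this text. Each sequence is simplified to a single data FIS.

```python
from enum import Enum


class Fis(Enum):
    """A subset of the eight FIS types, keyed by their 8-bit ID."""
    REG_H2D = 0x27       # register FIS, host to device
    REG_D2H = 0x34       # register FIS, device to host
    DMA_ACTIVATE = 0x39
    DATA = 0x46
    PIO_SETUP = 0x5F


# "tx" = host to device, "rx" = device to host (host point of view).
# Assumed, simplified sequences with exactly one data FIS per command.
SEQUENCES = {
    "non_data": [("tx", Fis.REG_H2D), ("rx", Fis.REG_D2H)],
    "pio_in": [("tx", Fis.REG_H2D), ("rx", Fis.PIO_SETUP), ("rx", Fis.DATA)],
    "dma_in": [("tx", Fis.REG_H2D), ("rx", Fis.DATA), ("rx", Fis.REG_D2H)],
    "dma_out": [("tx", Fis.REG_H2D), ("rx", Fis.DMA_ACTIVATE),
                ("tx", Fis.DATA), ("rx", Fis.REG_D2H)],
}


def expected_next(command_class, step):
    """Return the (direction, FIS) expected at a step, or None when done."""
    seq = SEQUENCES[command_class]
    return seq[step] if step < len(seq) else None


def follows_protocol(command_class, observed):
    """Check an observed list of (direction, FIS) pairs against the table."""
    return list(observed) == SEQUENCES[command_class]
```

A command state machine would step through such a table one FIS at a time and raise a protocol error when the observed FIS differs from expected_next.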
The shadow registers are copies of the device's task file content (similar to PATA). The registers are: data port, error, features, sector count, LBA, device/head, status, command, alternate status, and device control. The host sends the content of its shadow registers to the device each time it writes to its command or device control register, telling the device which command or control to expect so that the device's command layer knows what to do. In the same way, the device sends its task file to the host to update the shadow registers when necessary (see Figure 3.6 in Section 3.5 for an overview of the shadow registers). LBA (Logical Block Addressing) is the standardized addressing of storage devices that makes the device look "flat" to the host. The LBA is translated in the device to track, head, cylinder, and sector to match its physical structure.

In contrast to the shadow registers, the SATA-specific registers (see Table 2.1) contain information about host status, error information, host control registers, registers for NCQ, and notification registers.

Table 2.1. SATA-specific registers

Register | Content
SStatus, SCR0 | Link status, speed, link power state
SError, SCR1 | Error reporting based on severity or diagnostics
SControl, SCR2 | Control of power state transitions, resetting of the link, etc.
SActive, SCR3 | Used to support NCQ
SNotification, SCR4 | Identifies the source of an asynchronous event
SCR5 - SCR15 | Reserved for future use

The lower layers of SATA write information to the SATA-specific registers, making it visible to the application layer (see Figure 2.6). The transport layer has full access, both read and write, to the shadow registers, whereas the application layer only has restricted access (see also Section 3.5.3).

The DMA engine is the primary unit for data transfers between the host and the device. It is designed for high-speed communication and is therefore well suited for transferring large amounts of data.
The alternative is to use programmed input/output (PIO) data transfer, which provides higher reliability at a much lower data rate. But since the FER, as mentioned before (see Equation (2.1)), is so low, DMA transfers are normally used.

The command layer is the interface between the transport layer and the rest of the application layer. It consists of a number of state machines that handle different commands. The input to the command layer state machines is the command requested by software, and the output to the transport layer is the combination of FISes needed to execute the command.

Figure 2.6. Data flow for the SATA registers

2.2 SCSI

SCSI, or Small Computer System Interface, is an ANSI standard for communication with computer peripheral units. In contrast to SATA, SCSI provides higher reliability and a larger set of commands. Compared to SATA, it lets a single host control a greater number and more kinds of devices (the number of possible units on the SCSI bus depends on the physical interface). SCSI is a request-response based protocol where a host issues the requests and the addressed device responds. The request and/or response are transmitted in a structure named Command Descriptor Block (CDB). Each command is followed by a status byte sent from the device to the SCSI application client. For some commands the status simply is the response. If the status byte is sent during command execution when it is not expected, the command shall be terminated by the host. Data are represented by bytes in the SCSI standard.

The focus on SCSI in this report is the communication between an initiator and a particular device (see Figure 2.7), so most of the bus and controller parts are left out.

Figure 2.7. SCSI overview

Since SCSI-3 was introduced, the standard has had a layered model similar to the OSI model used in all sorts of data communications.
In Figure 2.8 the blocks above the horizontal line (at the Architecture Model block) represent the command layer, and the area below the line represents the SCSI protocol and physical layers (figure from [1]). The figure also shows the great variety and breadth of the SCSI protocols and the different kinds of physical buses, such as the Internet for iSCSI and the USB interface for UAS.

Figure 2.8. SCSI architecture

2.2.1 Command Descriptor Block

As mentioned in Section 2.2, commands are sent in CDBs (see Table 2.2). A CDB is usually a 6, 10, 12, or 16 byte long data structure that always starts with the command's Operation Code (byte 0). Each operation has a fixed byte length, so the SCSI device receiving a certain CDB knows its length simply by identifying the operation code. The LUN field (LUN, Logical Unit Number) indicates which logical unit the command is meant for if the SCSI device has multiple subunits (see Figure 2.7). All fields marked Operation specific are either reserved or specific to the particular operation or device. The Allocation length field indicates how long the SCSI response should be, and finally the Control Field is reserved for specific SCSI feature settings.

Table 2.2. CDB - General structure

Byte # | Bit 7 - Bit 5 | Bit 4 - Bit 0
0 | Operation Code (bits 7-0)
1 | LUN | Operation specific
2 ... n-2 | Operation specific (bits 7-0)
n-1 | Allocation length (bits 7-0)
n | Control Field (bits 7-0)

2.2.2 Status, Sense and Error Reporting

To indicate the outcome of a command, the command phase is followed by a status phase (see Figure 2.9) in which the device sends a single status byte back to the initiator. This status byte might be somewhat vague, so the device also stores sense data (a more detailed "status report") for every command so that the command initiator can request it if necessary. If a status byte is received by the SCSI application client when not expected, the client shall consider it an error.
In this case the application client shall terminate the command in progress. As mentioned before, the sense data is detailed information about status and error conditions. It contains several subcategories, such as the sense key and additional sense codes, for even more accurate reporting.

Figure 2.9. Execution of a SCSI write command

2.2.3 Primary Command Set

The set of SCSI standards used defines the interface, command set, and functions necessary for an interconnection between an application client and a specific device. As a starting point, Marushak and Jeppsen [5] listed the "Primary Command Set" (see Table 2.3), the minimum required SCSI commands to mount a SATA drive under many operating systems. All these commands are either from the SCSI Primary Commands (SPC [12]) or from the SCSI Block Commands (SBC [4]) specifications. SPC is the standard with mandatory and optional SCSI commands for general SCSI devices. SBC, on the other hand, is a command set extension for direct access storage devices, such as disk drives.

Table 2.3. SCSI - Primary Command Set

SCSI command | Description
Standard Inquiry | The standard inquiry command requests the SCSI device to deliver information about its manufacturer, model, and supported features.
Read Capacity | The read capacity command triggers the device to send information about its storage capacity.
Test Unit Ready | A simple command that reports whether the device is ready to execute commands or not.
Start/Stop Unit | Start/stop unit allows the application client to control the peripheral unit's power condition.
Read(6) | Older SCSI systems and units require a 6 byte read/write command and all SCSI systems are supposed to support it. Read(6) performs a read request targeting the block address of the device supplied with the command. The 6 stands for the command length in bytes.
Read(10) | Similar to Read(6) but with four extra bytes for optional control features.
Write(6) | A command that initiates writing of data from the host out-buffer to the device.
Write(10) | Similar to Write(6) but with four extra bytes for optional control features.
Request Sense | The request sense command is used to get information about errors, link status, and other sense data concerning the logical unit.
Mode Sense/Select | The mode sense command is used by the host to get device parameters. These are organized in pages and identified by page codes. The mode select command, on the other hand, is used to set operational parameters for the target device. The pages used by the mode sense/select commands are sets of operational parameters, such as caching parameters, error recovery, and more.

2.3 Virtex-5

Virtex-5 is an FPGA family, developed by Xilinx, that uses 65 nm technology. The FPGA family contains a great number of hard IP blocks that can be generated for implementation by Xilinx's program CoreGen. The Virtex-5 used during this thesis is an XC5VLX50T-FFG1136C, an FPGA with 480 programmable input/output blocks (IOB) and many other features, such as 6 clock management tiles (each with two Digital Clock Managers (DCM) and one Phase-Locked Loop block (PLL)), 12 Gigabit Transceivers (GTP), and 60 block RAM/FIFO blocks of 36 Kb each (see the Virtex-5 Family Overview [15]).

The clock management tiles provide high-performance clocking features such as phase-locked loops, dynamic clock configuration, and clock multiplication/division. The PLL is used for generating an output signal with a fixed phase in relation to its reference signal (using feedback). FIFO stands for First In First Out and is often referred to as a "data queue". FIFOs are commonly seen in flow-critical applications such as data buffering and clock domain transitions.
The GTP (see user guide UG196 [16]) is a very interesting IP block that provides features such as:
• 8b/10b encoding/decoding with k-character detection
• Support for Spread Spectrum Clocking
• Transfer speeds between 100 Mbps and 3.75 Gbps
• Integrated FIFO
• OOB signaling support
• Differential serialized external interface
• Parallelized 8/16-bit internal interface

The GTP thus has all the features necessary for communicating with SATA devices of generation 1 or generation 2, making the Virtex-5 FPGA a suitable platform for SATA implementations.

Chapter 3
SATA Design and Implementation on Virtex-5

This chapter is about the design and implementation of a SATA host. The host is evaluated by simulation and hardware testing.

3.1 System Overview

The system (see Figure 3.1) is a SATA host that is attached to a hard disk drive (hdd). The system performs the SATA initialization procedure, then writes some data to the drive using DMA, and thereafter reads the same data back from the hdd. All finite state machines (FSM) designed during the thesis are implemented as Moore state machines. The SATA standard [2] defines one FSM for each layer of the protocol. In this design each of those FSMs has been split into smaller ones for readability, design, and implementation purposes.

Figure 3.1. SATA-host system overview

3.2 Physical Layer Design

The physical (Phy) layer is designed simply by porting the existing Verilog code provided with [3] to VHDL code that follows Sectra's coding standard.

Figure 3.2. Reference design clocking scheme for Phy

The main structure consists of a top level that includes one gigabit transceiver (GTP), one state machine for handling out-of-band (OOB) signaling, one for speed negotiation control (SNC), and a couple of digital clock managers (DCM).
A difference compared to the SATA standard is that the 8b/10b encoding/decoding is done here (normally this is done at the link layer), since it is supported by the Virtex-5's GTPs. The GTP provides parallelized data in the form of words to the data conversion block. There the words are paired into complete Dwords, which requires an additional clock domain. This new clock is generated in an extra DCM (see Figure 3.2). The Phy has to generate four different clocks, using DCMs, from the reference clock of 150 MHz. The required clock domains can be seen in Table 3.1. The DCM supplies zero-phase-shifted clock outputs, which keeps the clock edges aligned even if they originate from different DCM blocks. The data going between the link layer and the Phy, in either direction, has a width of 32+1+1 bits (data, k-character, data valid bit), and the data type is called T_SATA_LINK_BUS.

Table 3.1. Clock domains in the Phy layer

Domain | Gen1 | Gen2
Dword | 37.5 MHz | 75 MHz
Word | 75 MHz | 150 MHz
GTP logic | 150 MHz | 300 MHz
SATA-link | 1.5 GHz | 3.0 GHz

The Word domain is also used in GTP logic.

3.3 Link Layer Design

The link layer implementation is designed with one idle state machine, one receiving (rx) side, and one transmitting (tx) side. The receiving side is designed with a primitive unsuppression block, a descrambler, a CRC calculator, and a state machine handling flow control. The tx side is similar to the rx side and consists of a state machine handling flow control, a CRC calculator, a primitive suppression unit, and finally a scrambler (see Figure 3.3).

Figure 3.3. Reference design for the link layer

The data interfaces between the link layer, Phy layer, and transport layer are converted here at the link layer from T_SATA_LINK_BUS to T_SATA_TRANS_BUS, which has a width of 32+1+1+1+1 bits (data, k-char, data valid bit, sof bit, and eof bit), or vice versa. The Power Management (PM) block in the figure is optional and not implemented.
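The Gen1 and Gen2 frequencies in Table 3.1 are consistent with dividing the serial line rate by 10, 20, and 40. The short Python sketch below reproduces the table under that reading: 8b/10b uses 10 line bits per byte, a word is two bytes, and a Dword is four bytes. The function name is hypothetical, and the interpretation of the GTP logic clock as the byte rate is an assumption based only on the numbers in the table.

```python
def phy_clocks_mhz(line_rate_gbps):
    """Derive the parallel clock rates in MHz from the serial line rate."""
    line_mhz = line_rate_gbps * 1000.0
    byte_mhz = line_mhz / 10          # 8b/10b: 10 line bits carry one byte
    return {
        "GTP logic": byte_mhz,        # assumed to equal the byte rate
        "Word": byte_mhz / 2,         # one 16-bit word per clock
        "Dword": byte_mhz / 4,        # one 32-bit Dword per clock
    }


for generation, rate in (("Gen1", 1.5), ("Gen2", 3.0)):
    print(generation, phy_clocks_mhz(rate))
```

For a 1.5 Gbps link this gives 150 MHz, 75 MHz, and 37.5 MHz, and for 3.0 Gbps it gives 300 MHz, 150 MHz, and 75 MHz.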
3.3.1 Link Layer, Idle State Machine

The idle state machine is the top level of the layer and makes sure that the link is idle before any sub state machine can change state (see Figure 3.4 and Listing 3.1). The figure shows all possible transitions, and the transition requirements can be seen in the listing. The state machine handles physical reset, device synchronization problems, and connection loss. When both the tx and the rx blocks are idle, the link layer sends SYNC_P to the device to indicate its idle state.

Figure 3.4. Link layer - Idle FSM

3.3.2 CRC, Scrambling, and Primitive Suppression

The CRC calculator and scrambler/descrambler (see Listing 3.2) are implemented according to Section 2.1.6. As can be seen in the listing, the scrambler is designed as a linear feedback shift register (LFSR).

Primitive suppression is used to "scramble" primitives. Primitives are not usually scrambled, but there might be situations where a long run of a single primitive occurs (such as a period of repeated HOLD_P). Such runs have to be scrambled to avoid EMI problems. The solution is primitive suppression, a method that keeps the first pair of primitives and substitutes the third primitive with a CONT_P. The following primitives are substituted by scrambled filler data (not coded as special characters) until a new primitive is recognized by the primitive suppression block and is sent unsubstituted. The receiver side has a block (primitive unsuppression) that reverses the suppression, so that the original flow control appears unchanged to the rest of the link layer. This block also removes all ALIGN_P primitives, since those are intended for readjusting the Phy layer.
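The primitive suppression scheme described above can be modelled on a stream of symbolic primitive names. The Python sketch below is a simplified illustration with hypothetical names, not the VHDL block of the design. It assumes that the scrambled filler Dwords can be represented by a single placeholder token and that the received stream starts with an ordinary primitive.

```python
JUNK = "<junk>"    # placeholder for one scrambled filler Dword
CONT = "CONT"
ALIGN = "ALIGN"


def suppress(primitives):
    """Keep two repeats, send CONT for the third, then filler data."""
    out, prev, run = [], None, 0
    for prim in primitives:
        run = run + 1 if prim == prev else 1
        prev = prim
        if run <= 2:
            out.append(prim)
        elif run == 3:
            out.append(CONT)
        else:
            out.append(JUNK)
    return out


def unsuppress(stream):
    """Restore the repeated primitive and drop ALIGN primitives."""
    out, last, repeating = [], None, False
    for token in stream:
        if token == ALIGN:
            continue                  # only meant for the Phy layer
        if token == CONT:
            repeating = True
            out.append(last)
        elif token == JUNK and repeating:
            out.append(last)
        else:
            repeating = False         # a new primitive ends the run
            last = token
            out.append(token)
    return out


tx = ["HOLD"] * 5 + ["R_IP", "R_IP"]
print(suppress(tx))
print(unsuppress(suppress(tx)) == tx)
```

A run of five HOLD primitives is sent as HOLD, HOLD, CONT and two filler tokens, and the receiver side expands it back to five HOLD primitives.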
Listing 3.1: Link layer - Idle FSM
----------------------------------------------------------------------
-- State register with asynchronous reset
P_STATE_CLOCK : process (clk, rst)
begin -- process
  if rst = '1' then
    idle_cs <= C_RESET;
  elsif rising_edge(clk) then
    idle_cs <= idle_ns;
  end if;
end process;

-- Next-state logic
P_IDLESTATE : process (go2syncesc, idle_cs, phy_rdy, rx_data_in)
begin -- process P_IDLESTATE
  idle_ns <= idle_cs;                  -- default: stay in current state
  case idle_cs is
    when C_RESET =>
      idle_ns <= C_NO_COMM;
    when C_NO_COMM =>
      if phy_rdy = '1' then
        idle_ns <= C_SEND_ALIGN;
      end if;
    when C_SEND_ALIGN =>
      if phy_rdy = '1' then
        idle_ns <= C_IDLE;
      else
        idle_ns <= C_NO_COMM_ERR;
      end if;
    when C_NO_COMM_ERR =>
      idle_ns <= C_NO_COMM;
    when C_IDLE =>
      if phy_rdy /= '1' then
        idle_ns <= C_NO_COMM_ERR;      -- connection lost
      elsif go2syncesc = '1' then
        idle_ns <= C_SYNC_ESC;
      end if;
    when C_SYNC_ESC =>
      if phy_rdy /= '1' then
        idle_ns <= C_NO_COMM_ERR;
      elsif (is_primitive(rx_data_in, C_X_RDY_P) or
             is_primitive(rx_data_in, C_SYNC_P)) then
        idle_ns <= C_IDLE;
      end if;
    when others => null;
  end case;
end process P_IDLESTATE;
----------------------------------------------------------------------

Listing 3.2: Scrambler implementation
----------------------------------------------------------------------
-- Declaration
function calc_scramble (
  data    : in STD_LOGIC_VECTOR;
  lfsr_iv : in STD_LOGIC_VECTOR)
  return STD_LOGIC_VECTOR;

-- Body: scrambles one Dword and returns the updated LFSR state
-- in the upper 16 bits and the scrambled Dword in the lower bits
function calc_scramble (
  data    : in STD_LOGIC_VECTOR;
  lfsr_iv : in STD_LOGIC_VECTOR)
  return STD_LOGIC_VECTOR is
  variable scrambled_and_lfsr : STD_LOGIC_VECTOR(C_Dword_W+15 downto 0);
  variable lfsr_pol           : STD_LOGIC_VECTOR(15 downto 0);
  variable lfsr_fb            : STD_LOGIC;
  variable data_out           : STD_LOGIC_VECTOR(C_Dword_W-1 downto 0);
begin
  lfsr_pol := lfsr_iv;
  for i in 0 to 31 loop
    lfsr_fb               := lfsr_pol(15);
    data_out(i)           := lfsr_pol(15) xor data(i);
    -- Shift left with feedback taps at bits 15, 13, 4 and 0
    lfsr_pol(15)          := lfsr_pol(14) xor lfsr_fb;
    lfsr_pol(14)          := lfsr_pol(13);
    lfsr_pol(13)          := lfsr_pol(12) xor lfsr_fb;
    lfsr_pol(12 downto 5) := lfsr_pol(11 downto 4);
    lfsr_pol(4)           := lfsr_pol(3) xor lfsr_fb;
    lfsr_pol(3 downto 1)  := lfsr_pol(2 downto 0);
    lfsr_pol(0)           := lfsr_fb;
  end loop;
  scrambled_and_lfsr(C_Dword_W+15 downto C_Dword_W) := lfsr_pol;
  scrambled_and_lfsr(C_Dword_W-1 downto 0)          := data_out;
  return scrambled_and_lfsr;
end calc_scramble;
----------------------------------------------------------------------

3.3.3 Link Layer Transceiver Block

The tx block is subordinate to the rx block (in the sense that rx data has higher priority than tx data) and therefore remains in its idle state as long as the rx state machine is non-idle. The tx block gets data from the transport layer's tx-FIFO when the transport layer requests a transmission and the rx block is idle. The FIS from the FIFO shall be framed by an SOF_P in front and by a calculated CRC value followed by an EOF_P primitive at the end. The frame is scrambled and transmitted according to the standardized flow control using primitives. When the transport layer indicates that a FIS in progress is to be discarded, the tx block is supposed to stop the transmission and send SYNC_P to the device so that it can also discard the malformed FIS. If the tx-FIFO becomes empty before the FIS is completely transferred, HOLD_P primitives are sent to the device to indicate a "pause" in the FIS.

3.3.4 Link Layer Receiver Block

The rx block acts upon received FISes and primitives. When receiving an SOF_P from the Phy, the rx block shall reply by sending R_IP_P if the transport layer's rx-FIFO is not full, or reply with HOLD_P primitives until the FIFO is no longer full. A CRC check is executed on every received FIS (after descrambling), and the result is forwarded to the transport layer and to the transmitting device. If a SYNC_P is received during a FIS reception, the rx block informs the transport layer that the FIS shall be discarded. Every FIS is unpacked by removing all primitives and the CRC from the FIS.
When the EOF_P arrives at the rx block, it replies with an R_OK_P if the reception was error-free; otherwise it replies with an R_ERR_P.

3.4 Transport Layer Design

The transport layer is designed with one idle state machine, two FIFOs, one interface towards the shadow registers, and one interface for DMA FIS handling. The idle state machine has overall control of the layer and is responsible for preventing overruns and underruns in the FIFOs. The dashed block in Figure 3.5, Built In Self Test (BIST), is an optional SATA feature that is only used for testing and not during normal operation. The BIST feature is not implemented in this design.

Figure 3.5. Reference design for the transport layer

3.4.1 Top Level and Idle State Machine

The top level of the transport layer is designed to be a traffic controller, since all incoming and outgoing data goes through the FIFOs. The idle state machine is also present at this level. It determines the type of each received FIS and then activates the subsystem responsible for that FIS type to handle its payload.

The two FIFOs were generated with Xilinx Core Generator, each one as a standard FIFO with a row width of 36 bits (to match T_SATA_TRANS_BUS), a depth of 512 rows, and a programmed full threshold of 28 rows (maximum non-data FIS).

3.4.2 Transport Layer, Shadow Register Interface

This subsystem is built from a group of small FSMs, each with a dedicated task such as PIO handling or encoding/decoding of register FISes. Encoding and decoding of all register FISes, set device bits FISes, and PIO FISes are thus performed here. The reason the PIO block is placed here is that the PIO setup and PIO data input are written to the shadow registers. PIO data output is sent in an ordinary data FIS, but received data is split up from Dwords into words to fit the shadow registers' Data register.
This of course fills the rx-FIFO, since no change of clock domain is made, so the problem has to be solved by the link layer's flow control mechanism.

3.4.3 Transport Layer, DMA Interface

The DMA interface is similar to the shadow register interface, but it handles all the DMA-related FISes such as DMA read/write, DMA setup, and so on. In the DMA case the incoming data never goes through the shadow registers, and therefore a DMA transfer is much faster than a PIO transfer.

3.5 Application Layer Design

The application layer (see Figure 3.6) is designed with an idle state machine, a shadow register block, a command layer, and a test state machine. The idle state machine primarily handles control signals and reset/start-up actions. The test state machine is a simple test application used for HW function verification. The command layer is an essential part of the application layer. It makes sure that the FISes needed for each supported command are sent and/or received in the correct order. The dashed blocks in Figure 3.6 represent possible features of a host with full functionality, features that are not yet implemented.

3.5.1 Application Layer, Top Level

The top level of the application layer is basically a couple of FSMs, one idle FSM and one for testing the system. All control signals also go through this part of the layer, to or from SW (i.e. the test FSM in this implementation) or the transport layer. The idle FSM's task is to take care of the layer's reset condition and to provide the command layer with the desired commands (from SW or, in this case, the test FSM). The test FSM provides a simple test procedure for verification of the SW initialization and of the host's ability to use DMA read/write (see Listing 3.3 and Table 3.2).

Figure 3.6. Reference design for the application layer

Table 3.2. Test FSM: state description

State | Description
C_ID_DEV | Wait for the Phy to complete the HW initialization, set next command to identify device, and wait until it is finished
C_DMA_SETUP1 | Wait until the application layer is in idle state
C_DMA_WR | Set next command to DMA write and wait until it is finished
C_DMA_SETUP2 | Wait until the application layer is in idle state
C_DMA_RD | Set next command to DMA read and wait until it is finished

Listing 3.3: Test - FSM at the application layer
----------------------------------------------------------------------
-- Next-state logic of the test FSM
P_TEST_STATE : process (app_test_cs, app_cs, cmd_complete)
begin -- process P_TEST_STATE
  app_test_ns <= app_test_cs;          -- default: stay in current state
  case app_test_cs is
    when C_ID_DEV =>
      if cmd_complete = '1' then
        app_test_ns <= C_DMA_SETUP1;
      end if;
    when C_DMA_SETUP1 =>
      if app_cs = C_IDLE then
        app_test_ns <= C_WR_DMA;
      end if;
    when C_WR_DMA =>
      if cmd_complete = '1' then
        app_test_ns <= C_DMA_SETUP2;
      end if;
    when C_DMA_SETUP2 =>
      if app_cs = C_IDLE then
        app_test_ns <= C_RD_DMA;
      end if;
    when C_RD_DMA =>
      if cmd_complete = '1' then
        app_test_ns <= C_DMA_SETUP1;   -- repeat the write/read cycle
      end if;
    when others => null;
  end case;
end process P_TEST_STATE;
----------------------------------------------------------------------

3.5.2 The Command Layer

The command layer is similar to the transport layer, but instead of knowledge about FIS structure this layer knows in which order to send and/or receive FISes for all kinds of SATA commands. The design is made to handle six groups of commands: DMA in, DMA out, PIO in, PIO out, non-data, and device diagnostic commands. Each group supports between one and nine commands, and together they form the set of mandatory SATA commands (see Appendix B).

3.5.3 Shadow Registers' Design

The shadow registers are, as the name indicates, a group of registers (see Figure 3.6) to which the rest of the application layer has restricted access.
This restriction is based on the value of the shadow Status register, which among other things indicates whether the device is busy and/or ready. If the device is busy or not ready, the rest of the layer has to wait for the status to change before using any of the shadow registers. Figure 3.6 also shows which registers are accessible for writing or reading initiated by the application layer. All the registers can be changed by the transport layer as a result of FISes received from the device.

3.6 Simulations

Simulation is a powerful tool for testing, verifying, and creating the design functionality. Testing different situations and cases before synthesizing the design into a HW implementation saves a lot of time. Simulation gives an opportunity to study the device's behaviour in an environment that is easy to use compared to reality. Even when new problems were identified during HW tests, the easiest way to a solution was to recreate the problem in a simulated environment.

All simulations were done in Modelsim 6.3c (for an example of what a simulation looks like, see Figure 3.8), with a testbench written in VHDL and test cases written in MVL (Macro Verification Language). MVL was developed by Dr. Daniel Wiklund and is similar to the C programming language. It is intended for implementing test cases in HDL development. The MVL code is parsed into a dispatch file and several command files. The dispatch file controls the execution of the command files, and the command files contain all the commands for the testbench.

Figure 3.7. The simulation system

All the testbenches used consist of at least three different VHDL files: one with the main testbench including the device under test (DUT), one package file with all needed components and functions, and one parser for the generated command files.
See Figure 3.7 for an overview of the simulation system.

To be able to detect the OOB signals, the testbenches' physical layer was designed with two GTPs, one for generation 1 speed and one for generation 2 speed. Other advantages of using GTPs in the testbench design are that they take care of 8b/10b encoding/decoding, detection of k-characters, data validity checks, and data transmission. This makes the rest of the testbench easier to implement, since it only has to handle words and Dwords. Test data are handled as integers, and an int is defined in a range up to $2^{31}$, so Dwords (32 bits) have to be divided into a couple of words before they can be evaluated in the simulations.

The MVL test cases are designed to test functionality in all the SATA layers of the DUT by sending different types of command FISes, various data FISes, and so on. Individual testbenches and test cases were constructed for each layer and for the complete SATA host system to get tests that are as accurate as possible. When the simulation has ended, it produces an HTML-based simulation report (see Figure 3.9).

Figure 3.8. Screen dump of successful OOB simulation

Figure 3.9. Screen dump of a simulation report

3.7 Hardware Tests

Hardware tests were made to verify the SATA host's functionality in practice. The HW tests were made with a Xilinx ml505 Virtex-5 FPGA Evaluation Platform acting as SATA host and different hdds acting as device (see Table 3.3). Two types of SATA cables were also used during testing, one of length 0.5 m and one of length 1 m. To verify the functionality, Xilinx's ChipScope software was used. It is a real-time verification product that provides on-chip debugging. For a complete list of test equipment see Appendix C.

Table 3.3. Disks used during HW-testing

Device | Features
Samsung HD250HJ | 250GB, SATA3.0, NCQ, Hot-plug
Western Digital WD3200AAKS | 320GB, SATA3.0
Ocztechnology OCZSSD2-1C32G | 32GB, SATA3.0, Solid state
Hitachi HTS541680J9SA00 | 80GB, SATA1.5

3.7.1 HW-test Part I, Start-up

HW-test part I tests the SATA host's ability to start and complete the SATA start-up procedure: OOB signaling, speed negotiation, and reception of a disk "signature" (a register FIS from device to host). After the OOB signaling and speed negotiation (see Section 2.1.4) are complete, the device is supposed to transmit a device signature to the host so that the host knows what type of device it is, in our case a hard disk drive. This means that the signature should look like Table 3.4, but beware of the transmission order of a Dword (see Figure 2.2).

Table 3.4. HDD initialization signature

Dword | Scrambled data | Unscrambled data
0 | 0xC38276B9 | 0x01500034
1 | 0xBF26B369 | 0xA0000001
2 | 0xA508436C | 0x00000000
3 | 0x3452D355 | 0x00000001
4 | 0x8A559502 | 0x00000000
CRC | 0x0FA400D0 | 0xB4BEBECB

3.7.2 HW-test Part II, Write and Read

The second HW test was based on the SATA SW initialization and a couple of DMA commands. The test can be ordered chronologically as follows:

1. After the hdd "signature" has been received, the host shall complete the SW initialization by sending an identify device command, a PIO in command that triggers the device to send a sector of configuration and disk information data to the host.
2. Now both HW and SW are up and running, so the host transfers one sector (512 bytes) of data to the device using a DMA write.
3. After the transfer, the host commands a DMA read of the same sector that was written.
4. If the test has a positive result, the CRCs of the sent and received data are the same.

The process is monitored and verified in ChipScope and can look like Figure 3.10. The figure is a screen dump from ChipScope, triggered on a data FIS reception.
In detail the figure shows a packet of one disk sector ending with a CRC of 0x5FA84B8E, which is the correct value calculated over one Dword of value 0x00000046 (start of data FIS) followed by 128 Dwords of value 0x12345678. The gap in the data payload (where rx_data_k is high) is a couple of ALIGNP primitives sent by the device. One can also see the flow control managed by primitives where rx_data_k and tx_data_k are high.

Figure 3.10. Data FIS reception from device

3.8 SATA to SATA Bridging

Since one of Sectra Communications' orientations is to securely encrypt data, a bridge between SATA and SATA was to be considered, together with what to think about when such a bridge is designed. Apart from the encryption part, the problems that arise from bridging a SATA link are FIS collisions, error detection and handling, flow control, initialization and speed negotiation problems, and timeouts.

A SATA to SATA bridge can with advantage be designed similarly to a Port Multiplier (see the SATA standard [7] or D. Anderson [2]), which is a standardized bridge that faces the same SATA-related dilemmas. All flow control managed by primitives is done within the bridge, except R_OKP and R_ERRP, which indicate whether a FIS has been delivered error-free or not. By doing this, the latency introduced for SATA-primitive actions is reduced in all but those two cases. By not responding with R_OKP or R_ERRP itself, the bridge does not have to buffer any FIS for resending purposes. When an error is recognized by the bridge it shall send SYNCP in both directions, making the host and the device escape and discard the FIS in progress and start their own error-handling mechanisms.

When it comes to initialization of the SATA link, the bridge starts OOB signaling and speed negotiation with both host and device (initialization with the device starts when an OOB COMRESET is detected from the host).
If the agreed speeds match each other the link is up, but if not the bridge throttles its tx-links, forcing the host and device to reset, and the bridge then only negotiates at the lowest accepted speed. Timeouts are host specific and determined by software, so the only thing that can be done is to minimize the latency and/or set the timeout parameters in the software.

Collisions might appear when both host and device send X_RDYP simultaneously, declaring that they both are ready to send one or more FISes. This can only happen when using NCQ, since SATA commands are otherwise constructed in such a way that a collision is impossible. The solution to a collision is simply that the bridge sends R_ERRP to the host so that it discards its FIS, receives the FIS from the device, and then if necessary resends the earlier discarded FIS. If one of the links, either from host or device, in any way becomes inoperable, the bridge shall send a SYNCP in the opposite direction, which yields one or more retries of sending FISes; this reduces the latency associated with a timeout.

When working with data in the bridge, for example encryption, this is most easily done at the transport layer, since that layer knows the structure of the different FISes. It can therefore easily recognize data FISes and work with the FIS payload without risking manipulation of non-data FISes. The CRC also has to be recalculated after data has been altered.

Chapter 4

Translating SCSI- into SATA-Commands

This chapter presents a command translation model inspired by the model in Overby [8] and by the ideas from Marushak and Jeppsen [5]. The model is meant to be used when translating SCSI commands to SATA commands. There are also some design proposals for a SCSI / Serial ATA Translation Layer (SSATL) implemented in a Virtex-5 FPGA.
4.1 Command Translation Model

As pointed out in [5], the command translation model has to handle two different situations: translating a SCSI request from the application client to SATA, and translating information from a SATA device to a SCSI response. This can be done in three ways: direct translation, emulation, or partial emulation. Direct translation can be implemented when a SCSI command can be directly mapped to a SATA command due to their equivalent functionality. Emulation, on the other hand, is used when a SCSI command completely lacks a SATA counterpart, so the SSATL has to emulate the behavior and respond to the SCSI request without actually communicating with the disk. If there is a SATA command that partially matches a SCSI command, then a mix of translation and emulation can be performed, and this is what is called partial emulation.

4.1.1 Support for Queued Commands

One of the greatest advantages of SATA compared to other storage standards is the NCQ feature, which is why it has to be supported by an SSATL. SCSI supports command queuing of all commands, so the SSATL will need a SCSI-command stack, whereas SATA's NCQ is limited to queuing read and write commands. When translating SCSI queued commands to NCQ commands there are some things to consider [8]:

• Maintain the mapping between the NCQ tags and the SCSI-command identifiers.

• The SSATL has to indicate its queue support to the SCSI application client by setting the corresponding bits in the response to the standard inquiry command.

• The SSATL shall use the queue depth supported by the attached device, indicated in data word 75 of the Identify device SATA command.

• If the SSATL receives a SCSI command corresponding to a non-queued SATA command while the SATA device is busy, the SSATL can either stack the new SCSI command for later execution, return task set full status for the command, or return busy status for the command to the SCSI application client.
• If SCSI command priorities are used (16 priority levels), those shall be translated to NCQ priorities (2 priority levels) as follows: SCSI priority 0 is mapped to NCQ priority 0, SCSI priorities 1-3 are mapped to NCQ priority 1, and SCSI priorities 4-15 are mapped to NCQ priority 0. A SCSI priority of 0 corresponds to vendor-specific or no priority, SCSI priority 1 is the highest priority, 2 the second highest, and 15 the lowest priority. In SATA NCQ, priority 1 is higher than priority 0.

• Errors normally terminate all queued and non-queued SATA commands in progress, and a suitable error message shall be sent to the SCSI application client.

4.1.2 Translating LBA

According to [8] the SATA LBA address can be translated in two ways: direct logical block mapping or indirect logical block mapping of SCSI logical blocks. The SCSI host can ask the disk about its physical sector size by sending it a read capacity command. If the disk uses a sector size other than 512 bytes, this is indicated in the identify device SATA command by bit 12 in word 106 being set to one. In this case the sector size in bytes can be calculated by multiplying the value returned in words 118 down to 117 (also in the identify device data) by two. If the host prefers a different sector size than the disk's physical one, it can try to change it with, for example, the SCSI format unit command (not described in this report) with parameters set such that the sector size shall be changed. If the SSATL supports different sector sizes it can use the direct (linear) block mapping in cases where the host sector size is a multiple of the disk's sector size. If not (some SCSI systems use sector sizes that are not a power of two, like Fibre Channel's 528 bytes), the SSATL has to use a more complex indirect mapping of the LBA. In both cases the LBA translation has to be reversible so that the original address can be recovered.
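The priority rule from Section 4.1.1 and the sector-size and direct-mapping rules from Section 4.1.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the identify device data is assumed to be available as a list of 256 16-bit words, and word 118 is assumed to hold the upper half of the words-per-sector value.

```python
from typing import Sequence, Tuple


def ncq_priority(scsi_priority: int) -> int:
    """Map a 4-bit SCSI command priority to the 1-bit NCQ priority."""
    if not 0 <= scsi_priority <= 15:
        raise ValueError("SCSI priority must be in 0..15")
    # Only SCSI priorities 1-3 become the high NCQ priority (1).
    return 1 if 1 <= scsi_priority <= 3 else 0


def logical_sector_bytes(identify: Sequence[int]) -> int:
    """Sector size in bytes from identify device data (256 words)."""
    if identify[106] & (1 << 12):
        # Assumption: word 118 is the high half, word 117 the low half.
        words_per_sector = (identify[118] << 16) | identify[117]
        return words_per_sector * 2
    return 512


def host_to_disk_lba(host_lba: int, host_bytes: int, disk_bytes: int) -> Tuple[int, int]:
    """Direct (linear) mapping: first disk LBA and number of disk sectors."""
    if host_bytes % disk_bytes != 0:
        raise ValueError("direct mapping needs a host size that is a multiple")
    ratio = host_bytes // disk_bytes
    return host_lba * ratio, ratio


def disk_to_host_lba(disk_lba: int, host_bytes: int, disk_bytes: int) -> int:
    """Reverse mapping, so the original host address can be recovered."""
    return disk_lba // (host_bytes // disk_bytes)
```

With a 1024-byte host sector and a 512-byte disk sector, host LBA 10 covers disk LBAs 20 and 21, and both of those map back to host LBA 10. A sector size such as 528 bytes is rejected here because it requires the indirect mapping, which this sketch does not cover.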
4.1.3 Translating SCSI Control Byte

Each SCSI command request contains a control field of one byte (see Table 4.1) that is used primarily for linked commands and error handling. The vendor-specific field can be used by the SSATL if desired; otherwise this field can be ignored. The NACA bit indicates usage of Normal Auto Contingency Allegiance, an error handling protocol not supported by SATA, so if this bit equals one for any SCSI request the command shall be terminated. The Link bit is used (set to one) if the request is one in a chain of commands, so-called linked commands. This is not supported by SATA either and shall also result in a command termination. So if either the NACA or the Link bit is set to one, the SSATL shall terminate the command by sending a check condition status response with the sense key set to illegal request and the additional sense code set to invalid field in CDB [8]. The sense key and code are error information that is stored by the SSATL until requested by the SCSI application client, or sent together with the status byte if the system supports so-called auto-sense. For detailed information regarding the check condition command termination procedure, see Penokie [9].

Table 4.1. SCSI - control byte

Bit 7 and 6        Bit 5, 4 and 3    Bit 2    Bit 1     Bit 0
Vendor-specific    Reserved          NACA     Obsol.    Link

4.1.4 Translation of the Primary Command Set

This section handles the translation and/or emulation of the commands in the primary command set (see Section 2.2). The summary of the translation from Marushak and Jeppsen [5] can be seen in Table 4.2. The following subsections contain more detailed information about each of those commands.

Table 4.2.
Primary Command Set translation summary

SCSI-command        Mapping method        SATA-command
Standard Inquiry    Direct translation    Identify Device
Read Capacity       Direct translation    Identify Device
Test Unit Ready     Partial emulation     Check power mode
Start/Stop Unit     Partial emulation     Flush cache, Flush cache exp, Standby, Standby immediate, Verify sectors, Media eject
Read(6)             Direct translation    any DMA read
Read(10)            Direct translation    any DMA read
Write(6)            Direct translation    any DMA write
Write(10)           Direct translation    any DMA write
Request Sense       Partial emulation     Check power mode
Mode Sense          Partial emulation     Identify Device
Mode Select         Partial emulation     Set Features

Standard Inquiry

When the SSATL receives a SCSI CDB from the application client with a command code of 0x12 and bit 0 of the second byte in the CDB equal to zero, the requested response is the standard inquiry. The first thing the SSATL should do is to send an identify device SATA command to the disk. The identify device is a four way command, starting with a host to device register FIS indicating the start of the command. Then two FISes are sent from device to host, one PIO-setup FIS and one PIO-data FIS. The information received in the data FIS shall be used to construct the inquiry response, see Example 4.1.

Example 4.1: Direct translation: Standard inquiry

The first thing to do is to translate the SCSI command request:

• Identify the command as an inquiry (see Table 4.3), in this case by confirming that the Operation Code is 0x12.

• Check the value of EVPD (Enable Vital Product Data). If EVPD is zero, that indicates a standard inquiry. If EVPD is one, the expected response is a Vital Product Data (VPD) page if that is supported; otherwise the command is terminated by sending back an appropriate status.

• The Page Code field is used to specify which VPD page is desired by the SCSI application client when EVPD = 1; the field can be ignored if EVPD = 0.
• The Allocation Length field indicates the space allocated for the response and thereby, indirectly, the length of the response.

• Finally the control field shall be handled as described in Section 4.1.3.

Table 4.3. Inquiry request

Byte #    Content (bit 7 down to bit 0)
0         Operation Code: 0x12
1         Logical Unit # | Reserved | EVPD (bit 0)
2         Page Code
3         Reserved
4         Allocation Length
5         Control Field

Assume that the CDB is a standard inquiry and that no problems were found in the Control Field. Now the SSATL shall make the SATA application layer send an identify device command to the SATA device. During the identify device, the device sends a data FIS to the host containing 256 words (512 bytes) of device information and settings (see Table 4.4).

Table 4.4. Identify device FIS

Word #     Content (byte 1, byte 0; bit 15 down to bit 0)
0          General configuration
...        ...
23-26      Firmware revision
27-46      Model number
...        ...
76         SATA capabilities
...        ...
100-103    Maximum user LBA
...        ...
106        Sector size
...        ...
117-118    Words/logical sector
...        ...
255        Integrity word

Now that the SSATL has received the identify device data it can start the second part of the translation, constructing a valid standard inquiry response (see Table 4.5). The following list shows the byte-wise construction of the standard inquiry response.

• Byte 0 → 0x00, indicating a direct-access storage device.

• Map the Removable Media Bit, RMB, (bit 7) of the identify device's General configuration field to the standard inquiry's RMB field.

• Byte 2, the Version field, shall be set to indicate the SPC version supported by the SSATL (e.g. 0x05 for SPC-3).

• Byte 3 → 0x02, indicating the response format as in Table 4.5 and that the optional settings NACA and HiSup are not supported.

• The Additional length field shall be set to match the Allocation Length field from the request.

• Byte 5 to 7 → 0x00, indicating that those settings are not supported.
• Byte 8 to 15, the T10 Vendor identification, shall be set to the ASCII code for "ATA" followed by five ASCII spaces: A = 0x41, T = 0x54 and space = 0x20.

• The inquiry's Product identification bytes shall be taken from the identify device's Model number (first 16 bytes).

• The inquiry's Product revision level bytes shall be taken from the identify device's Firmware revision field. The Firmware revision has twice the size of the Product revision level, so the translation shall use either the first or the last half of the Firmware revision field (the half that does not consist only of ASCII spaces).

• The Version descriptor fields tell the application client which SCSI standards are supported by the SSATL (up to eight). All available standards can be found in SCSI Primary Commands - 3 [12].

• Remaining fields in the standard inquiry response can be set to all zeros.

Table 4.5. Standard Inquiry response

Byte #    Content (bit 7 down to bit 0)
0         Peripheral Device qual. | Type code
1         RMB | Reserved
2         Version
3         Obsol. | Obsol. | NACA | HiSup | Resp. data format
4         Additional length (n-4)
5         SCCS | ACC | TPGS | 3PC | Reserved | Proj.
6         BQue | EnSer | VS | MultiP | MChngr | Obsolete | Addr16
7         Obsolete | WBus16 | Sync | Linked | Obsol. | CmdQue | VS
8-15      T10 Vendor identification
16-31     Product identification
32-35     Product revision level
36-55     Vendor-specific information
56-57     Reserved
58-73     Version descriptor 1-8
74-95     Reserved
96-n      Vendor-specific data

For a more detailed description of the translation see SCSI / ATA Translation - 2 [8].

Read Capacity

The Read Capacity request is a ten byte structure starting with 0x25 and followed by nine 0x00 bytes (otherwise terminate the command as in Section 4.1.3). The response is eight bytes long and the requested information can be found in the SATA identify device data (see Table 4.4). The first four bytes shall be set to the smaller of the last LBA of the device (the identify device field Maximum user LBA) and 0xffffffff.
If the LBA is the larger one, this normally triggers a request of Read Capacity (16); the 16 byte version is similar to Read Capacity but supports a larger address space. The last four bytes equal the device's sector size in bytes. This is set to 512 bytes (0x00000200) if bit 12 in the Sector size field of the SATA identify device data is equal to zero. If this bit is equal to one, the last four bytes shall be equal to the value of the Words/logical sector field multiplied by two.

Test Unit Ready

This command consists of two bytes, the operation code 0x09 and the control byte. The response shall be either check condition status or good status. The status is determined in the following order:

1. Device busy indicated in the shadow register ⇒ Check condition status with sense key Not ready and additional sense code Logical unit not ready, cause not reportable.

2. Device is executing an earlier requested Start/Stop unit SCSI command ⇒ Check condition status with sense key Not ready and additional sense code Logical unit not ready, initializing command required.

3. The device is running a BIST test ⇒ Check condition status with sense key Not ready and additional sense code Logical unit not ready, self-test in progress.

4. The SATA status registers indicate that no device is present ⇒ Check condition status with sense key Not ready and additional sense code Medium not present.

5. The DF (Device Fault) bit in the SATA Status register was set to one after the last executed SATA command ⇒ Check condition status with sense key Hardware error and additional sense code Logical unit failure.

6. The SSATL tells the SATA application layer to send a Check power mode command to the device. If the command is executed without errors ⇒ Good status. Otherwise ⇒ Check condition status with sense key Not ready and additional sense code Logical unit does not respond to selection.
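The ordered checks above amount to a priority chain, which the following Python sketch makes explicit. It is a minimal illustration, not the SSATL design: the `DeviceState` fields and the `send_check_power_mode` callback are hypothetical names, the sense information is kept as descriptive strings rather than numeric codes, and the status byte values are 0x00 for good and 0x02 for check condition.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

GOOD = 0x00
CHECK_CONDITION = 0x02


@dataclass
class DeviceState:
    """Hypothetical snapshot of what the SSATL knows about the device."""
    busy: bool = False                 # busy bit in the shadow status register
    start_stop_running: bool = False   # earlier Start/Stop Unit still executing
    self_test_running: bool = False    # BIST in progress
    device_present: bool = True
    device_fault: bool = False         # DF bit after the last SATA command


def evaluate_test_unit_ready(
    state: DeviceState, send_check_power_mode: Callable[[], bool]
) -> Tuple[int, Optional[str], Optional[str]]:
    """Return (status byte, sense key, additional sense code)."""
    not_ready = "NOT READY"
    if state.busy:
        return CHECK_CONDITION, not_ready, "LOGICAL UNIT NOT READY, CAUSE NOT REPORTABLE"
    if state.start_stop_running:
        return CHECK_CONDITION, not_ready, "LOGICAL UNIT NOT READY, INITIALIZING COMMAND REQUIRED"
    if state.self_test_running:
        return CHECK_CONDITION, not_ready, "LOGICAL UNIT NOT READY, SELF-TEST IN PROGRESS"
    if not state.device_present:
        return CHECK_CONDITION, not_ready, "MEDIUM NOT PRESENT"
    if state.device_fault:
        return CHECK_CONDITION, "HARDWARE ERROR", "LOGICAL UNIT FAILURE"
    # Only now is the device actually contacted (the partial emulation case).
    if send_check_power_mode():
        return GOOD, None, None
    return CHECK_CONDITION, not_ready, "LOGICAL UNIT DOES NOT RESPOND TO SELECTION"
```

Passing the Check power mode command in as a callback keeps the first five checks free of any communication with the disk, so the command is sent only when none of the emulated conditions apply.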
So the Test Unit Ready SCSI command is either emulated, or partially emulated if the SATA check power mode command is used to check the device's readiness.

Example 4.2: Emulation: Test Unit Ready

The Test Unit Ready SCSI command sent by the SCSI application client looks like Table 4.6.

Table 4.6. Test Unit Ready request

Byte #    Content (bit 7 down to bit 0)
0         Operation Code: 0x09
1         Control Field

Assume that the busy bit in the status register of the SATA shadow registers is one. The response in this case should be: Check condition status with sense key Not ready and additional sense code Logical unit not ready, cause not reportable. As mentioned in Section 2.2 the status is reported back to the application client as a single byte, and Check condition corresponds to 0x02. The sense key and additional sense code are both stored in the SSATL and are not delivered to the application client until requested (if auto-sense is not implemented).

Start/Stop Unit

The Start/Stop Unit SCSI command (operation code 0x1B) does not have a single SATA command corresponding to its functionality, but with combinations of SATA commands the Start/Stop Unit command can be partially emulated. The SATA commands transferred to the device differ depending on the content of the Start/Stop Unit fields (see Section 9.11 in Overby [8]). The translation procedure primarily depends on the Start/Stop Unit's power condition field, which can be either start valid, active, idle, standby, or force_s_0. All other values shall result in a Check condition status with the sense key set to Illegal request and the additional sense code set to Invalid field in CDB.

Example 4.3: Partial emulation: Start/Stop Unit

First determine the CDB and translate it to a SATA command if necessary. Assume that the received CDB from the SCSI application client looks like Table 4.7.
If the IMMED (immediate start/stop) bit is equal to zero, that indicates that status shall be sent back after completion of the command instead of immediately when the CDB has been verified by the SSATL. The Power condition is set to start valid (0x0) and the Power cond modifier = 0x0 (this field is only used as additional information when Power condition = idle). The NFLUSH bit (No flush) indicates whether cached data in the SSATL (if a cache is used) shall be written to the device or not. The LOEJ bit (load eject) indicates whether the storage medium shall be unloaded or loaded. The START bit tells the SSATL to transition to either the stopped or the active power state. Assume that NFLUSH, LOEJ and START all equal zero in this example.

The SSATL reads the power condition field and recognizes it as the start valid condition, and then looks at the LOEJ and START bits, which are both zero. If the IMMED bit had been one, the SSATL would now have returned a good status to the SCSI application client. Instead IMMED is zero, making the SSATL tell the SATA application layer to send a flush cache to the device. After the flush cache command the SSATL requests a standby immediate SATA command, and after an error-free transmission the SSATL sends a good status to the SCSI application client. The SSATL now considers the SATA device to be in the SCSI defined "Stopped power state".

Table 4.7. START/STOP Unit request

Byte #    Content (bit 7 down to bit 0)
0         Operation Code: 0x1B
1         Reserved | IMMED (bit 0)
2         Reserved
3         Reserved | Power cond modifier (bits 3-0)
4         Power condition (bits 7-4) | Res. | NFLUSH | LOEJ | START
5         Control

Read(6), Read(10), Write(6), and Write(10)

All the read and write SCSI commands can be mapped to one or more SATA DMA read and write commands, depending on the supported SATA features (e.g. NCQ or 48-bit LBA).
The LBA shall be translated according to Section 4.1.2, and the SSATL shall operate on the LBAs corresponding to the block address field and transfer length field in the SCSI command. If the SCSI command is Read(6) or Write(6) and the transfer length field is zero, the SSATL shall translate this to 256 instead of 0. The SATA device then operates on the LBAs corresponding to the specified 256 SCSI blocks instead of performing an operation on 0 blocks (Overby [8]).

Request Sense

A Request Sense command (operation code 0x03) requests any available sense data stored by the SSATL. The SSATL determines if there is sense data to deliver and tells the SATA application layer to send a check power mode command to the device to set the value of the power condition sense. If there is no sense data, the SSATL shall complete the Request sense command by sending Good status with sense key No sense and the additional sense code No additional sense. The SSATL can be designed to auto-sense, which means that it sends sense data together with the status byte at the end of SCSI commands instead of waiting for a Request sense from the application client.

Mode Sense/Select

These two SCSI commands require mode page storage in the SSATL. The Mode Sense command simply gets information from the addressed mode page, whereas Mode Select is intended to change operation parameters in an addressed mode page. For a full list of supported mode pages and features see Section 10.1 in Overby [8]. Most of the parameters are not supported by SATA devices and should therefore be emulated. The supported features are first written to the mode pages at the arrival of the first identify device. If one of the SATA-supported parameters is changed by a Mode Select command, this should be mapped to a set features SATA command.
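As an illustration of the read and write translation described earlier in this section, the sketch below extracts the LBA and block count from a Read/Write CDB and applies both the zero-means-256 rule for the 6-byte commands and the control byte check from Section 4.1.3. The operation codes and CDB field positions are the standard SCSI values and are not taken from this report; the function names are hypothetical.

```python
from typing import Tuple

# Standard SCSI operation codes (added for illustration).
READ_6, WRITE_6, READ_10, WRITE_10 = 0x08, 0x0A, 0x28, 0x2A

NACA_BIT = 1 << 2  # control byte, bit 2
LINK_BIT = 1 << 0  # control byte, bit 0


def control_byte_ok(control: int) -> bool:
    """False if NACA or Link is set, i.e. the command must be terminated."""
    return (control & (NACA_BIT | LINK_BIT)) == 0


def parse_rw_cdb(cdb: bytes) -> Tuple[int, int, int]:
    """Return (lba, number of blocks, control byte) for a read/write CDB."""
    opcode = cdb[0]
    if opcode in (READ_6, WRITE_6):
        if len(cdb) != 6:
            raise ValueError("6-byte CDB expected")
        lba = ((cdb[1] & 0x1F) << 16) | (cdb[2] << 8) | cdb[3]
        blocks = cdb[4] or 256  # a zero transfer length means 256 blocks
        control = cdb[5]
    elif opcode in (READ_10, WRITE_10):
        if len(cdb) != 10:
            raise ValueError("10-byte CDB expected")
        lba = int.from_bytes(cdb[2:6], "big")
        blocks = int.from_bytes(cdb[7:9], "big")  # zero really means no blocks
        control = cdb[9]
    else:
        raise ValueError("not a Read/Write (6) or (10) command")
    return lba, blocks, control
```

A caller would first reject the command with check condition if `control_byte_ok` returns false, and otherwise pass the LBA and block count on to the LBA translation of Section 4.1.2 before issuing the DMA command.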
4.2 SCSI / Serial ATA Translation Layer - SSATL

The SCSI / SATA Translation Layer, or SSATL, is a design implementing the Command Translation Model defined in Section 4.1 and is intended to be used in a Virtex-5 FPGA. The FPGA shall also contain a SATA-host implementation and some kind of SCSI interface (see Figure 4.1).

Figure 4.1. System overview using SSATL

4.2.1 Design Considerations

The design's top level flow graph can be seen in Figure 4.2, and the two translation blocks might be constructed like the SATA command layer, with specific FSMs for each supported SCSI request/response.

Figure 4.2. SSATL top level flow graph

Special consideration shall be taken in the following cases:

• Sector size differences between the host and device: what happens if the host has a sector size of 256 bytes and the device one of 512 bytes, and the host wants to write one sector of bytes?

• An unsupported response length is requested: the SATA protocol always uses multiples of four bytes, whereas SCSI is single-byte oriented.

• Shall a SCSI command queue be used, or shall the SSATL terminate every incoming SCSI request while the device is not idle? Consider the situation when the SSATL is attached to a SCSI bus with more than one host. How shall the SSATL handle the situation when the SATA disk is executing a command from host A while it receives a new command from host B?

• Emulation of Mode Pages and Sense Data.

• The change between the byte and Dword domains also requires a clock-domain crossing.

• How shall the Self-Monitoring, Analysis, and Reporting Technology (S.M.A.R.T) SATA feature be supported? S.M.A.R.T is a feature that many SATA devices have, and it can be used for prediction of faults or device degradation.

• Many SCSI controllers support ATA pass-through commands [8], that is, ATA commands encapsulated into SCSI CDBs.
This should ease the implementation, but the bridge loses its transparency in one sense, since the system then knows that an attached device supports ATA commands. This also influences the requirements on the initiator, since it has to know ATA commands.

Sector Size Differences

The problem appears when the sector size of the initiator differs from the device's sector size, whether it is greater or smaller. For example, if the initiator's sector size is 1024 bytes and the device's sector size is 512 bytes there is not much of a problem; each initiator sector is translated to exactly two device sectors. The real dilemma appears when the initiator's sector size is smaller than the device's, or if the two sector sizes are not multiples of each other. There are several approaches to this problem:

1. The host sectors are mapped to as many disk sectors as are needed to hold them, and the possible "sector remainder" is filled with zeros. This has the disadvantages of overhead on the SATA link, making it "slower", and of a practical loss of storage space on the SATA disk. Problems arise if the SATA device for some reason is to be attached to some other peripheral, since that one has no idea of the sector mapping.

2. Only allow logical sector sizes that are multiples of the physical sector size of the disk. This results in linear mapping in all allowed cases, and an I/O buffer can be used to solve the translation.

3. Do not allow different sector sizes; only choose devices and hosts that share the same sector size, or where the host sector is larger than the disk sector and an integer multiple of the disk's sector size, so that the mapping becomes trivial.

4. Implement some kind of adaptive LBA translation model that allows both linear and non-linear mapping of sectors. It is still important that the logical block is aligned with a physical block so that the boundaries are clear.
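To make the cost of the first approach concrete, the following Python sketch pads one host sector out to whole disk sectors and computes how much of the occupied disk space carries host data. It is only an illustration of the zero-padding idea; the function names are hypothetical and no claim is made that an SSATL would be structured this way.

```python
from typing import List


def disk_sectors_per_host_sector(host_bytes: int, disk_bytes: int) -> int:
    """Smallest number of disk sectors that can hold one host sector."""
    return -(-host_bytes // disk_bytes)  # ceiling division


def pad_host_sector(data: bytes, disk_bytes: int) -> List[bytes]:
    """Split one host sector into disk sectors, zero-filling the remainder."""
    count = disk_sectors_per_host_sector(len(data), disk_bytes)
    padded = data.ljust(count * disk_bytes, b"\x00")
    return [padded[i:i + disk_bytes] for i in range(0, len(padded), disk_bytes)]


def storage_utilisation(host_bytes: int, disk_bytes: int) -> float:
    """Fraction of the occupied disk space that holds host data."""
    count = disk_sectors_per_host_sector(host_bytes, disk_bytes)
    return host_bytes / (count * disk_bytes)
```

For a 528-byte host sector on a disk with 512-byte sectors, each host sector occupies two disk sectors with 496 bytes of zero padding, so just over half of the occupied space holds host data. A 256-byte host sector on the same disk uses exactly half of one disk sector.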
If some SCSI command requests an unsupported sector size, the SSATL can terminate the command by sending a check condition status byte and setting the sense key and/or the additional sense code to suitable values, for example illegal request and invalid field in CDB respectively. This forces the host to adapt instead of the device (this should work without any problems since changeable sector size is an optional SCSI feature).

Unsupported Response Length

The critical point here is how important the data is that will be lost if a too short response length is used. Therefore it might be a good idea to let the solution be command dependent rather than general. Two possible solutions are:

1. Fill up the response with the requested amount of data and discard the rest.

2. Terminate the command by sending a check condition status byte and setting the sense key and/or the additional sense code to suitable values, for example illegal request and invalid field in CDB respectively.

SCSI Command Queue

If only one host is connected to the SCSI bus at a time, a command collision will not occur and a command queue is not needed. Otherwise, if the SSATL and the SATA device are executing a command while the SSATL receives a new one, it can either terminate the newer one or enqueue it for later execution. A termination can be done by sending a check condition status byte to the initiator of the new command and setting the sense key and/or the additional sense code to suitable values. By doing this the SSATL forces the initiator to act, by either retransmitting the request or simply doing nothing. The second option is to implement some kind of SCSI-command queue, a FIFO that can hold one or more SCSI requests, each with a size of up to 16 bytes. This might lead to new problems like overflow of the FIFO, extra latency, and possible timeouts for the involved hosts.
Of course one can design a combination of the queue and termination solutions, which either enqueues only control based requests or starts rejecting requests only when the buffer is full.

Mode Pages and Sense Data

The translation between SATA errors and SCSI errors has a simple and straightforward mapping provided in SCSI / ATA Translation - 2 [8]. It is based on the combined value of the SATA status and error registers and how it shall be mapped to the SCSI sense codes, sense key and additional sense code. Besides the common error reporting, the sense codes can be set at other events, like when terminating SCSI commands.

What SCSI has that SATA does not when it comes to error and condition reporting is so-called mode pages. The SSATL has to emulate the supported pages (up to 64 pages, defined by the 6-bit address field of mode sense/select). The pages' data has to be stored in registers during reception of identify device and different kinds of S.M.A.R.T-related SATA commands. The registers shall be updated on new receptions of such SATA commands and when the SSATL receives a valid mode select (one that targets a SATA-changeable parameter in a supported page). If the mode sense/select is not valid, the SSATL shall terminate the command by sending the status byte.

S.M.A.R.T Support

If the device supports S.M.A.R.T commands they can be used to update the mode pages (see above).

ATA Pass-Through Commands

The ATA pass-through commands are two commands, ATA pass-through(12) and ATA pass-through(16), and are defined in SCSI / ATA Translation - 2 [8]. It also defines an ATA return descriptor and a specific ATA mode page. The great advantage of using ATA pass-through commands is that they make it possible to use every SATA command without any emulation in the translation step for these SCSI commands.
The drawback is that not all SCSI controllers support the technology, and by using these commands the SSATL becomes "visible" to the SCSI application client.

Chapter 5

Results and Discussion

This chapter presents results and comments from the simulations and HW-tests performed on the SATA-host design and implementation. The last section is a discussion about the SSATL block.

5.1 SATA-host Result

The final SATA-host design (see Appendix D) ended up with approximately 9000 lines of VHDL code (without testbenches and test cases). The host's phy, link, and transport layers are working and support the mandatory demands of the SATA protocol. The application layer design, on the other hand, is limited compared to a fully functional SATA application layer, due to lack of time. For example, it does not support all the specific SATA features (e.g. NCQ), and the complete SW has been modeled with an FSM performing a fixed sequence of commands.

5.1.1 Simulations

This section is a summary of the results from the simulations (see Section 3.6 for more background information) of the SATA-host system. All these simulations finally passed. Since the system is intended to be used in a bridge solution, the latency of signals is crucial, so the results presented focus on the latency in different parts of the system. Below is the outcome of simulations of DMA read, DMA write, and different kinds of PIO commands targeting the shadow registers. The results are listed in Table 5.2 and Table 5.3. For a short description of the result tables see Table 5.1.

Table 5.1. Definitions for simulation results

Definition                   Description
System                       The subsystem/system that is in focus
Direction                    Whether the input or the output is in focus
∆ cycles                     Latency in clock cycles (Dword domain)
time [ns] generation 1       Latency in time for generation 1
time [ns] generation 2       Latency in time for generation 2

Table 5.2.
DMA - simulation summary System Direction ∆ cycles time [ns] generation 1 time [ns] generation 2 Physical Layer Link Layer Transport Layer Application Layer Complete System In Out In Out In Out In Out In Out 6 6 4 3 9 11-45 1 1 15 16-51 26.67 26.67 106.7 80.00 240.0 293.3-1200 26.67 26.67 400.0 426.7-1360 13.33 13.33 53.33 40.00 120.0 146.7-600.0 13.33 13.33 200.0 213.3-680.0 Table 5.3. Shadow registers - simulation summary System Direction ∆ cycles time [ns] generation 1 time [ns] generation 2 Physical Layer Link Layer Transport Layer Application Layer Complete System In Out In Out In Out In Out In Out 6 6 4 3 11-400 11-45 4 4 20-409 19-53 26.67 26.67 106.7 80.00 293.3-10667 293.3-1200 106.7 106.7 533.3-10907 506.7-1413 13.33 13.33 53.33 40.00 146.7-5333 146.7-600.0 53.33 53.33 266.7-5453 253.3-706.7 For the Phy layer the GTP actions stands for five of the six cycles and can therefore 5.1 SATA-host Result 55 not be reduced. The allowed latency for flow control is 20 Dword cycles which means that on reception of a HOLDP the receiver has 20 Dword cycles of time until a HOLDAP response shall be on the wire. In this case: • GTP actions i.e. parallelize bit stream to 20 bit groups, and decode them with 8b/10b resulting in data words. This takes 5 Dword clock cycles. • Pair words to Dwords, 1 Dword clock cycle. • Unsuppression, descrambling, and primitive detection, 3 Dword cycles. • Send primitive, suppress, and scramble, 3 Dword cycles • Split Dword into pair of words, 1 Dword cycle • GTP actions, 5 Dword cycles This results in a total of 18 Dword clock cycles which correspond to the standard. The results, in both cases, clearly shows fluctuating output latency for the transport layer. The simple explanation is that the input has priority and the SATA-link only provides data traffic in one direction at a time. So if the link layer (that stalls the transport layer) is idle when the output is initiated, then the latency corresponds to the lower boundary. 
The extra latency is a result of the link layer flow control, which is defined in the SATA standard [2], and it cannot be reduced to any large extent (perhaps a few cycles can be saved in state transitions). Therefore the interesting latency is the lower bound. Like the output, the input at the transport layer has a non-constant latency; this depends on differences in the complexity of handling the FISes that use the shadow registers. The largest latency gap appears when using a PIO-in command like identify device, since each Dword must be divided into two words because the data shadow register is too narrow to hold a whole Dword (this register is the initial target of the PIO-in commands). So the added latency of the implementation is less than one µs/Dword for both DMA and PIO commands.

A big issue with the simulations has been the testbench and its complexity. A simple testbench is preferable since that minimizes bugs in its code, but here the testbench had to act as a SATA device and be able to issue and respond to primitives etc. To avoid designing an entire SATA device as well, the testbench had to be simplified while still making sure that it could act upon different kinds of flow control situations. The testbench now consists more or less of while-loops, which makes it very sensitive to timing. This has led to a situation where debugging the testbench has been almost as time consuming as debugging the SATA-host design.

5.1.2 HW-Tests

This section describes the results of the HW-tests presented in Section 3.7. There is also a short discussion of the results of the two tests. A synthesis report of the implementation used during testing can be seen in Appendix E.

Part I

The results of the HW-tests (see Section 3.7.1) can be seen in Table 5.4. The Ocztec. solid-state drive completed the entire OOB and speed negotiation procedure but at first never managed to send a device signature.
The problem was in the phy layer, which after speed negotiation started to send R_RDY_P instead of SYNC_P. R_RDY_P should not be sent until an X_RDY_P has been received from the device.

Table 5.4. Results of the HW-tests, part I

Test        Cable  Samsung  WD  Ocztec.  Hitachi
OOB         0.5 m  Y*       Y   Y        Y
OOB         1.0 m  Y*       Y   Y        Y
speed neg.  0.5 m  Y*       Y   Y        Y
speed neg.  1.0 m  Y*       Y   Y        Y

WD stands for Western Digital, Y for Yes and N for No.
*OOB signaling often stalls in the state that sends the dial tone (see d10.2 in Figure 2.4) when trying to establish a connection with the Samsung disk.

The cause of the problem with the Samsung disk was not found during this test. At this point possible reasons were discussed. For example, it could depend on some protocol deviation in the phy design that the other disks do not react to (as in the Ocztec. case above). Another explanation might be that the Samsung disk has a lower voltage threshold on its differential link and therefore loses the connection to the SATA host, or that the differential signal from the Samsung disk is weaker than the other disks' signals, making it appear invalid to the host.

Part II

The result of the second HW-test (see Section 3.7.2) can be seen in Table 5.5. The Samsung disk showed the same problem as in the part I test. It completed the test in some runs, but in others the test was terminated or the HW-initialization never completed. A late test run indicated that pairs of ALIGN_P were not sent to the device every 256 Dword clock cycles, due to protocol violations in the phy and link layers. After this shortcoming was fixed the Samsung disk started to work properly. The part I test can therefore be considered successful as well, since it is a part of test II.

Table 5.5. Results of the HW-tests, part II

Test     Cable  Samsung  WD  Ocztec.  Hitachi
SW-init  0.5 m  Y        Y   Y        Y
SW-init  1.0 m  Y        Y   Y        Y
DMA      0.5 m  Y        Y   N        N... 
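The ALIGN_P rule behind the Samsung fix in Part II can be illustrated with a small software model. This is only a sketch under simplifying assumptions: it places one pair of ALIGN_P at the start of every window of 256 transmitted Dwords and ignores all other primitives. The real link layer does this in hardware, and the function name `insert_align_pairs` is hypothetical.

```python
ALIGN_P = 0x7B4A4ABC  # hexadecimal value of ALIGN_P, taken from Table A.2


def insert_align_pairs(dwords, window=256):
    """Return the Dword stream with two ALIGN_P opening every `window` Dwords.

    Simplified model: each window of `window` transmitted Dwords starts with
    one ALIGN_P pair, so at most `window - 2` other Dwords separate two pairs.
    """
    out = []
    sent_in_window = 0
    for dword in dwords:
        if sent_in_window == 0:
            out.extend([ALIGN_P, ALIGN_P])  # ALIGN_P is always sent in pairs
            sent_in_window = 2
        out.append(dword)
        sent_in_window += 1
        if sent_in_window == window:
            sent_in_window = 0              # next Dword opens a new window
    return out


payload = list(range(600))
stream = insert_align_pairs(payload)
```

With 600 payload Dwords the model emits three pairs, at stream positions 0, 256, and 512, and leaves the payload order untouched.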
The SW-initialization was completed without further problems for all the disks. When it comes to the DMA operation, the Ocztec. solid-state disk behaves suspiciously. During the DMA write everything seems right until about 20 Dwords have been sent; then the disk starts sending HOLD_P primitives. The number of HOLD_P primitives is larger than 8192 (the sample depth of ChipScope), so the entire sequence cannot be monitored during a single test. The host keeps on sending up to 51 Dwords of data before it receives the first HOLD_P. Nothing goes wrong here: after the HOLD_P sequence the rest of the data is transferred and a correct CRC is sent to the disk. Everything seems correct during the primitive flow control, but at the end the device responds with an R_ERR_P, indicating an error in the payload. Later, when trying to read the same sector as written, the host receives the first 51 Dwords of data (which look right), but for the rest of the sector the disk returns something that looks like a factory test pattern. The transfer ends without any errors, but the received data differs from the data sent.

The problem might be caused either by a protocol violation or by the solid-state disk's physical differences compared to a normal SATA disk, such as its physical sector size of 4096 bytes compared to 512 bytes. The problem clearly indicates that payload content has been added, lost, or corrupted after the CRC calculation in the host. I base this on the fact that the CRC is correct for the sent payload and that the error is indicated with an R_ERR_P and not with a SYNC_P. A buffer overflow can, for example, result in lost data and might be caused by the extra Dwords sent before the device receives a HOLDA_P. If this is the case, the only thing to do with the SATA-host design is to try to reduce the flow control latency mentioned in Section 5.1.1 above.
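The flow control budget from Section 5.1.1 and the conversion from Dword clock cycles to time can be checked with a short calculation. The sketch below assumes the nominal line rates of 1.5 Gbps (generation 1) and 3.0 Gbps (generation 2) and 40 bits on the wire per Dword after 8b/10b encoding. The step names and cycle counts are those listed in Section 5.1.1; the function names are hypothetical.

```python
# Nominal SATA line rates in bit/s (assumption: generation 1 and 2 only).
LINE_RATE = {1: 1.5e9, 2: 3.0e9}
BITS_PER_DWORD_ON_WIRE = 40  # 4 bytes, each 10 bits after 8b/10b encoding

# HOLD_P in to HOLDA_P out, in Dword clock cycles (from Section 5.1.1).
HOLD_TO_HOLDA_STEPS = [
    ("GTP receive: parallelize and 8b/10b decode", 5),
    ("pair words into a Dword", 1),
    ("unsuppress, descramble, detect primitive", 3),
    ("send primitive, suppress, scramble", 3),
    ("split Dword into a pair of words", 1),
    ("GTP transmit", 5),
]
FLOW_CONTROL_LIMIT = 20  # allowed flow control latency in Dword cycles


def dword_period_ns(generation):
    """Length of one Dword clock cycle in nanoseconds."""
    return BITS_PER_DWORD_ON_WIRE / LINE_RATE[generation] * 1e9


def cycles_to_ns(cycles, generation):
    """Convert a latency in Dword clock cycles to nanoseconds."""
    return cycles * dword_period_ns(generation)


def flow_control_budget():
    """Return (cycles used, cycles of margin) for the HOLD_P response path."""
    used = sum(cycles for _, cycles in HOLD_TO_HOLDA_STEPS)
    return used, FLOW_CONTROL_LIMIT - used


used, margin = flow_control_budget()
print(f"HOLD_P to HOLDA_P: {used} cycles, margin {margin} cycles")
print(f"Dword period: {dword_period_ns(1):.2f} ns (gen 1), "
      f"{dword_period_ns(2):.2f} ns (gen 2)")
```

Under these assumptions one Dword cycle is about 26.67 ns for generation 1 and 13.33 ns for generation 2, which matches the per-cycle figures in Table 5.2 and Table 5.3. The response path leaves only two cycles of margin, so there is little room for the reduction discussed above.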
5.2 SSATL - Discussion

A commercial SSATL should support all the required commands specified in SPC [12] and SBC [4], as well as the ATA pass-through commands from SAT [8]. The primary command set discussed in this report (see Section 2.2.3 for further information) covers around 12% of the full command functionality but still provides a solid starting point for tests. Apart from the commands, a lot of page codes, parameters, and mode pages also have to be supported. This takes a great deal of emulation to work properly. A possible implementation of an SSATL will be large but not necessarily complex, since each of the approximately 90 commands has its own unique translation and/or emulation (which is straightforward in most cases).

Chapter 6 Future Work

This chapter deals with how to proceed with the work already done and how it can be improved.

6.1 SATA Host

When it comes to the SATA host, improvements can be made in all the layers:

• Phy layer: The design is a Xilinx application translated from Verilog to VHDL and has only been tested by Xilinx during HW-initialization. The complete design can therefore be replaced with an in-house design to make sure of its functionality during normal SATA operations.
• Link layer: Add the optional power management FSM and extend the error reporting to higher layers.
• Transport layer: Add support for built-in self test (BIST) and extend the error reporting to the application layer.
• Application layer: This layer needs a lot more work to reach full SATA functionality. One thing that has to be done to support full functionality is to extend the command layer to support some of the optional SATA commands. Another is to implement real SW, for example on the Virtex-5's soft processor, the MicroBlaze.

The complete design can also be upgraded to support the new SATA 3.0 standard, which supports transfer speeds up to 6.0 Gbps (the new revision will be released by SATA-IO in the first half of 2009 [6]).
But for a SATA 3.0 implementation, the Virtex-5 FPGA used during this thesis has to be exchanged for a Virtex-5 FPGA from either the FXT or the TXT series. The FXT and TXT FPGAs have gigabit transceivers supporting speeds up to 6.5 Gbps, while the current FPGA's GTPs support up to 3.75 Gbps.

The SATA-host design uses only one GTP and could be extended with additional transport, link, and physical layers for the remaining GTPs. Together with proper adjustments to the application layer, this can enable port multiplier or RAID features (see Figure 6.1). RAID stands for redundant array of independent disks and is a technology used for increased performance (more than one disk used as a single device) or backup (the same data written to more than one disk at a time). A port multiplier is a SATA switch that enables a host to reach more than one device.

Figure 6.1. Virtex-5 in a SATA RAID or Port multiplier solution

6.2 SSATL

The SSATL is so far a model and a number of design proposals, so the continuation is to:

• Investigate exactly which SCSI support the SSATL shall provide concerning commands, parameters, error reporting, mode pages, and so on.
• Complete the translation/emulation of the chosen SCSI support.
• Design the SSATL system.
• Implement the design.
• Test and verify the implementation's functionality.

If the intended SATA subsystem is a port multiplier, the device numbers can be mapped to the corresponding SCSI logical unit numbers (LUN).

6.3 SCSI to SATA Bridge

Besides the changes and improvements mentioned in Sections 6.1 and 6.2, an arbitrary SCSI interface is needed for connection to a SCSI bus. For example, if the bus is the Internet (using iSCSI), the additional blocks are an Ethernet block (physical link) and a TCP/IP block handling that protocol. When the bridging is complete, additional features can be added to the bridge, encryption in Sectra's case.
To avoid affecting control, for example by changing primitives or commands, encryption can advantageously be applied at either:

• the SSATL, where the payload is easily separated from CDBs and other data, or
• the SATA transport layer, where knowledge of FISes can be used to encrypt only the payload of the data FISes.

Chapter 7 Conclusions

The outcome of this thesis, apart from personal experience, is a HW SATA host and a translation model for SCSI-to-SATA commands. The design of the SATA host is, through modifications of the application layer, adaptable to various applications, since the phy, link, and transport layers are the same regardless of usage and are easy to include layer by layer. The latency of the SATA host has been shown by simulations to be less than one µs/Dword. Compare this with, for example, Windows XP's ATA time-out of 10 s (see Microsoft support [11]), and the latency of the implementation should not pose much of a problem for most applications. This makes the design suitable for bridges such as SATA-to-SATA or SCSI-to-SATA but does not exclude other alternatives.

The SATA host is compatible with both generation 1 and generation 2 of SATA, and all its code is written according to Sectra's coding standard. The implementation is synthesized for an ml505 platform, but by changing the user constraint file (UCF) it is easy to change platform as long as the new platform provides SATA connections and is based on a Virtex-5 FPGA. The host works in stand-alone mode; it does not need external control or settings and is independent of extra equipment. This enables the application client to be unaware of a possible bridge, making the bridge transparent.

Concerning the SSATL, the translation model can be used as a starting point for a future design. The model states some general rules but also gives some detailed examples of how a translation might be approached.
Still, the translation between SCSI and SATA has great challenges to face, such as SCSI's larger command set, the layer design, and HW-support for a SCSI interface.

When it comes to HW, the FPGAs in the Virtex-5 family are well suited to the task, and as can be seen in Appendix E there are unused resources that should be enough for further implementations such as an SSATL and a SCSI interface or an extra SATA interface. Some of the Virtex-5 FPGAs even have gigabit transceivers supporting 6.5 Gbps and can therefore be used for the new third generation of SATA devices.

Finally, the potential market looks interesting, keeping in mind that SATA is the most common standard for mass storage of data in new computers. Combine this with the fact that many companies and authorities handle restricted or secret information and often use SCSI solutions already. Letting the bridge be the "interpreter" between the protocols and adding an encryption module gives customers a secure and fast solution for mass storage of their data. Loss of a disk then no longer means that an outsider can read the information.

Bibliography

[1] Figure over the SCSI architecture. URL: http://www.t10.org/scsi-3.htm, 2009-01-08.
[2] Don Anderson. SATA Storage Technology. MindShare, Inc., 2007. ISBN 978-0-9770878-1-5.
[3] Matt DiPaolo and Simon Tam. Serial ATA Physical Link Initialization with the GTP Transceiver of Virtex-5 LXT FPGAs. Xilinx, Inc., XAPP870 Version 1.0, 3 January 2008. URL: http://www.xilinx.com/support/documentation/application_notes/xapp870.pdf.
[4] Mark Evans. Information technology - SCSI Block Commands - 3 (SBC-3). American National Standards Institute, Revision 17, 17 November 2008. ISO/IEC 14776-323 : 200x, URL: http://www.t10.org/cgi-bin/ac.pl?t=f&f=sbc3r17.pdf.
[5] Nathan Marushak and Roger Jeppsen. Translating SCSI into SATA for the best of both worlds. Posted 21 June 2004. URL: http://www.snwonline.com/evaluate/translating_06-21-04.asp, 2008-11-17.
[6] Serial ATA International Organization. Detailed information about third generation of SATA. URL: http://www.sata-io.org/6gbdetails.asp, 2009-01-16.
[7] Serial ATA International Organization. Serial ATA International Organization: Serial ATA Revision 2.5. Serial ATA International Organization, 2005. URL: http://www.sata-io.org.
[8] Mark A. Overby. Information Technology - SCSI/ATA Translation - 2 (SAT-2). American National Standards Institute, Revision 05, 22 June 2008. URL: http://www.t10.org/cgi-bin/ac.pl?t=f&f=sat2r06.pdf.
[9] George Penokie. Information technology - SCSI Architecture Model - 4 (SAM-4). American National Standards Institute, Revision , 14 May 2008. ISO/IEC 14776-414:200x, URL: http://www.t10.org/cgi-bin/ac.pl?t=f&f=sam4r14.pdf.
[10] Friedhelm Schmidt. The SCSI Bus and IDE Interface Protocols, applications and programming. Addison-Wesley, 1997. ISBN 0-201-17514-2.
[11] Microsoft Support. IDE ATA and ATAPI disks use PIO mode after multiple time-out or CRC errors occur. Revision 7.14, posted 3 December 2007. URL: http://support.microsoft.com/kb/817472, 2009-01-08.
[12] Ralph O. Weber. Information technology - SCSI Primary Commands - 3 (SPC-3). American National Standards Institute, Revision 23, 4 May 2005. ISO/IEC 14776-313 : 200x, URL: http://www.t10.org/cgi-bin/ac.pl?t=f&f=spc3r23.pdf.
[13] Xilinx. ML505/506 four GTPs IBERT quickstart. URL: http://www.xilinx.com/products/boards/ml505/ml505_10.1_3/docs/ml505_ibert_4gtps_quickstart.pdf, 2008-12-04.
[14] Xilinx. ML505/ML506/ML507 Evaluation Platform User Guide. Xilinx, Inc., UG347 Version 3.0.1, 21 July 2008. URL: http://www.xilinx.com/support/documentation/boards_and_kits/ug347.pdf.
[15] Xilinx. Virtex5 Family Overview. Xilinx, Inc., DS100 Version 4.4, 23 September 2008. URL: http://www.xilinx.com/support/documentation/data_sheets/ds100.pdf.
[16] Xilinx. Virtex-5 FPGA RocketIO GTP Transceiver User Guide. Xilinx, Inc., UG196 Version 1.7, 23 September 2008.
URL: http://www.xilinx.com/support/documentation/user_guides/ug196.pdf.
[17] Xilinx. Virtex-5 FPGA User Guide. Xilinx, Inc., UG190 Version 4.2, 9 May 2008. URL: http://www.xilinx.com/support/documentation/user_guides/ug190.pdf.

Appendix A SATA Primitives

This appendix clarifies the SATA primitives. Table A.2 shows the encoding and hexadecimal representation of each primitive; see Example A.1 for how to calculate the hexadecimal value. A brief description of every SATA primitive can be found in Table A.1.

Example A.1: Encoding of SYNC_P

All byte information in this example can be found in Table A.2. Start with byte 0, which for SYNC_P is K28.3. The K indicates that this byte shall be encoded as a K-character in the 8b/10b encoder, whereas a D indicates that it is an ordinary data byte. Now to the number combination 28.3: the 28 corresponds to the five least significant bits of the byte and the 3 corresponds to the three most significant bits of the byte. This gives us 3 = 011 and 28 = 11100, which yields 01111100 = 0x7C. Repeat this for the rest of the bytes and you end up with 0xB5B5957C.

Table A.1. Description of primitives

Primitive   Name                                    Short description
ALIGN_P     Physical layer control                  Used to readjust the phy; always sent in even numbers.
CONT_P      Continue repeating previous primitive   Used to indicate primitive suppression of the preceding non-ALIGN_P primitive.
DMAT_P      DMA terminate                           Terminates a DMA transfer.
EOF_P       End of frame                            Indicates that a FIS and its CRC have been sent.
HOLD_P      Hold data transmission                  Sent by the transmitter when the next data is not ready to send, or by the receiver if its rx-FIFO is full.
HOLDA_P     Hold acknowledge                        Sent as response to HOLD_P.
PMACK_P     Power management acknowledge            Response to PMREQ_P_P or PMREQ_S_P if the sender is ready to change power mode.
PMNAK_P     Power management denial                 Response to PMREQ_P_P or PMREQ_S_P if the sender is not ready to change power mode or if it is not supported.
PMREQ_P_P   Power management request to partial     Transceiver goes to Partial power mode if PMACK_P is received as a response.
PMREQ_S_P   Power management request to slumber     Transceiver goes to Slumber power mode if PMACK_P is received as a response.
R_ERR_P     Reception error                         Indicates an error in the received payload.
R_IP_P      Reception in progress                   Indicates that payload is being received.
R_OK_P      Reception with no error                 Indicates that the payload was received without error.
R_RDY_P     Receiver ready                          Indicates that the receiver is ready to receive payload.
SOF_P       Start of frame                          Indicates that the next non-primitive Dword is the first Dword of a payload.
SYNC_P      Synchronization                         Synchronizes host and device.
WTRM_P      Wait for frame termination              Sent after EOF_P until R_OK_P or R_ERR_P is received.
X_RDY_P     Transmission data ready                 Indicates that the transmitter has payload ready for transfer.

Table A.2. Primitive encoding

Primitive   Byte 3  Byte 2  Byte 1  Byte 0  hex value
ALIGN_P     D27.3   D10.2   D10.2   K28.5   0x7B4A4ABC
CONT_P      D25.4   D25.4   D10.5   K28.3   0x9999AA7C
DMAT_P      D22.1   D22.1   D21.5   K28.3   0x3636B57C
EOF_P       D21.6   D21.6   D21.5   K28.3   0xD5D5B57C
HOLD_P      D21.6   D21.6   D10.5   K28.3   0xD5D5AA7C
HOLDA_P     D21.4   D21.4   D10.5   K28.3   0x9595AA7C
PMACK_P     D21.4   D21.4   D21.4   K28.3   0x9595957C
PMNAK_P     D21.7   D21.7   D21.4   K28.3   0xF5F5957C
PMREQ_P_P   D23.0   D23.0   D21.5   K28.3   0x1717B57C
PMREQ_S_P   D21.3   D21.3   D21.4   K28.3   0x7575957C
R_ERR_P     D22.2   D22.2   D21.5   K28.3   0x5656B57C
R_IP_P      D21.2   D21.2   D21.5   K28.3   0x5555B57C
R_OK_P      D21.1   D21.1   D21.5   K28.3   0x3535B57C
R_RDY_P     D10.2   D10.2   D21.4   K28.3   0x4A4A957C
SOF_P       D23.1   D23.1   D21.5   K28.3   0x3737B57C
SYNC_P      D21.5   D21.5   D21.4   K28.3   0xB5B5957C
WTRM_P      D24.2   D24.2   D21.5   K28.3   0x5858B57C
X_RDY_P     D23.2   D23.2   D21.5   K28.3   0x5757B57C

Appendix B Mandatory SATA Commands

The SATA commands supported by the host design are equivalent to the mandatory SATA command set (see Table B.1). The set is divided into six groups depending on command characteristics.
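As an illustration of this grouping, the command set of Table B.1 can be written as a lookup from command code to group, so that a command layer only needs one handler per group. This is a sketch and not the VHDL command layer of the thesis; the names `Group`, `COMMANDS`, and `group_of` are hypothetical. The code for Read Sector is given as 0x20, its value in the ATA command set.

```python
from enum import Enum


class Group(Enum):
    """The six command groups of Table B.1."""
    DMA_IN = "DMA in"
    DMA_OUT = "DMA out"
    PIO_IN = "PIO in"
    PIO_OUT = "PIO out"
    NON_DATA = "Non-data"
    DEVICE_DIAGNOSTIC = "Device Diagnostic"


# ATA command code -> (command name, group), following Table B.1.
COMMANDS = {
    0xC8: ("Read DMA", Group.DMA_IN),
    0xCA: ("Write DMA", Group.DMA_OUT),
    0xC4: ("Read Multiple", Group.PIO_IN),
    0xEC: ("Identify Device", Group.PIO_IN),
    0x20: ("Read Sector", Group.PIO_IN),   # READ SECTOR(S) in the ATA command set
    0x30: ("Write Sector", Group.PIO_OUT),
    0xC5: ("Write Multiple", Group.PIO_OUT),
    0x40: ("Read Verify Sector", Group.NON_DATA),
    0xC6: ("Set Multiple Mode", Group.NON_DATA),
    0xE0: ("Standby Immediate", Group.NON_DATA),
    0xE1: ("Idle Immediate", Group.NON_DATA),
    0xE2: ("Standby", Group.NON_DATA),
    0xE3: ("Idle", Group.NON_DATA),
    0xE5: ("Check Power Mode", Group.NON_DATA),
    0xE6: ("Sleep", Group.NON_DATA),
    0xE7: ("Flush Cache", Group.NON_DATA),
    0xEF: ("Set Features", Group.NON_DATA),
    0x90: ("Execute Device Diagnostics", Group.DEVICE_DIAGNOSTIC),
}


def group_of(command_code):
    """Return the handling group of a supported command code."""
    return COMMANDS[command_code][1]
```

An unsupported code raises `KeyError` here; a real command layer would instead report the command as aborted.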
Each command in a group is handled in a similar way by the SATA command layer. To fully support all SATA-specific features (e.g. NCQ) the command set has to be extended.

Table B.1. Mandatory SATA Commands

Group              Command                     Code
DMA in             Read DMA                    0xC8
DMA out            Write DMA                   0xCA
PIO in             Read Multiple               0xC4
PIO in             Identify Device             0xEC
PIO in             Read Sector                 0x20
PIO out            Write Sector                0x30
PIO out            Write Multiple              0xC5
Non-data           Read Verify Sector          0x40
Non-data           Set Multiple Mode           0xC6
Non-data           Standby Immediate           0xE0
Non-data           Idle Immediate              0xE1
Non-data           Standby                     0xE2
Non-data           Idle                        0xE3
Non-data           Check Power Mode            0xE5
Non-data           Sleep                       0xE6
Non-data           Flush Cache                 0xE7
Non-data           Set Features                0xEF
Device Diagnostic  Execute Device Diagnostics  0x90

Appendix C Test Equipment

For the HW-tests the following equipment and software were used:

• Xilinx Virtex-5 ML505 Evaluation Platform, with a Virtex-5 XC5VLX50T-FFG1136
• Xilinx Platform Cable USB II, model DLC10
• Xilinx ChipScope Pro Analyzer, version 10.1
• Samsung HD250HJ
• Western Digital WD3200AAKS-00B3A0
• Ocztechnology OCZSSD2-1C32G
• Hitachi HTS541680J9SA00
• SATA cable, 0.5 m and 1.0 m
• Power supplies for disks and ml505

For further information concerning the Virtex-5 see the user guides UG190 [17] and UG196 [16], and for the ml505 see user guide UG347 [14]. Figure C.1 (modified figure from [13]) shows an overview of the ml505 board and its important connections and jumpers for a proper SATA configuration.

Figure C.1. The ml505 Evaluation Platform

Appendix D Final SATA Host Design

The final SATA host design can be seen in Figure D.1. The design is intended for a Virtex-5 FPGA; it has been simulated in ModelSim 6.3c and tested on a Xilinx ml505 Evaluation Platform.

Figure D.1. Detailed SATA-host overview

Appendix E Synthesis Report

This appendix shows the synthesis report for the design. The report shows the usage of the FPGA and its components.
This synthesis report was automatically generated during synthesis of the SATA host's VHDL code. For a clarification of the report's content see Table E.1.

Table E.1. Definitions for synthesis report

Definition  Description
Slice       Cluster of elements (e.g. flip flops) in the FPGA
IOB         Input/Output Block, a programmable connection point to the FPGA
LUT         Look Up Table

Observe that the ChipScope components are included in this synthesis, which makes the numbers misleading in some cases. Especially notice the Specific Feature Utilization section, which indicates a memory usage of over 80 % for the implementation, when the real usage without the ChipScope components is 1 %.

----------------------------------------------------------------------
Release 10.1.03 Map K.39 (lin)
Xilinx Mapping Report File for Design 'top'

Design Information
------------------
Command Line   : map -ol high -o mapped.ncd top.ngd
Target Device  : xc5vlx50t
Target Package : ff1136
Target Speed   : -1
Mapper Version : virtex5 -- $Revision: 1.46.12.2 $
Mapped Date    : Mon Dec 15 14:07:23 2008

Design Summary
--------------
Number of errors:      0
Number of warnings:  112

Slice Logic Utilization:
  Number of Slice Registers:                 2,973 out of  28,800   10%
    Number used as Flip Flops:               2,973
  Number of Slice LUTs:                      4,084 out of  28,800   14%
    Number used as logic:                    3,801 out of  28,800   13%
      Number using O6 output only:           3,185
      Number using O5 output only:             365
      Number using O5 and O6:                  251
    Number used as Memory:                     249 out of   7,680    3%
      Number used as Shift Register:           249
        Number using O6 output only:           248
        Number using O5 and O6:                  1
    Number used as exclusive route-thru:        34
  Number of route-thrus:                       400 out of  57,600    1%
    Number using O6 output only:               398
    Number using O5 output only:                 1
    Number using O5 and O6:                      1

Slice Logic Distribution:
  Number of occupied Slices:                 1,896 out of   7,200   26%
  Number of LUT Flip Flop pairs used:        5,043
    Number with an unused Flip Flop:         2,070 out of   5,043   41%
    Number with an unused LUT:                 959 out of   5,043   19%
    Number of fully used LUT-FF pairs:       2,014 out of   5,043   39%
    Number of unique control sets:             122
    Number of slice register sites lost
      to control set restrictions:             225 out of  28,800    1%

  A LUT Flip Flop pair for this architecture represents one LUT paired with
  one Flip Flop within a slice. A control set is a unique combination of
  clock, reset, set, and enable signals for a registered element.
  The Slice Logic Distribution report is not meaningful if the design is
  over-mapped for a non-slice resource or if Placement fails.

IO Utilization:
  Number of bonded IOBs:                        20 out of     480    4%
    Number of bonded IPADs:                      4
    Number of bonded OPADs:                      2

Specific Feature Utilization:
  Number of BlockRAM/FIFO:                      51 out of      60   85%
    Number using BlockRAM only:                 49
    Number using FIFO only:                      2
    Total primitives used:
      Number of 36k BlockRAM used:              48
      Number of 18k BlockRAM used:               1
      Number of 18k FIFO used:                   2
    Total Memory used (KB):                  1,782 out of   2,160   82%
  Number of BUFG/BUFGCTRLs:                     10 out of      32   31%
    Number used as BUFGs:                        7
    Number used as BUFGCTRLs:                    3
  Number of BSCANs:                              1 out of       4   25%
  Number of BUFDSs:                              1 out of       6   16%
  Number of DCM_ADVs:                            2 out of      12   16%
  Number of GTP_DUALs:                           1 out of       6   16%

  Number of RPM macros:                         11

Peak Memory Usage: 471 MB
Total REAL time to MAP completion: 1 mins 52 secs
Total CPU time to MAP completion:  1 mins 52 secs
----------------------------------------------------------------------